4
DECIDABILITY
In Chapter 3 we introduced the Turing machine as a model of a general purpose computer and defined the notion of algorithm in terms of Turing machines by means of the Church-Turing thesis.
In this chapter we begin to investigate the power of algorithms to solve problems. We demonstrate certain problems that can be solved algorithmically and others that cannot. Our objective is to explore the limits of algorithmic solvability. You are probably familiar with solvability by algorithms because much of computer science is devoted to solving problems. The unsolvability of certain problems may come as a surprise.
Why should you study unsolvability? After all, showing that a problem is unsolvable doesn't appear to be of any use if you have to solve it. You need to study this phenomenon for two reasons. First, knowing when a problem is algorithmically unsolvable is useful because then you realize that the problem must be simplified or altered before you can find an algorithmic solution. Like any tool, computers have capabilities and limitations that must be appreciated if they are to be used well. The second reason is cultural. Even if you deal with problems that clearly are solvable, a glimpse of the unsolvable can stimulate your imagination and help you gain an important perspective on computation.
165
166
CHAPTER 4 / DECIDABILITY
4.1
DECIDABLE LANGUAGES
In this section we give some examples of languages that are decidable by algorithms. We focus on languages concerning automata and grammars. For example, we present an algorithm that tests whether a string is a member of a context-free language (CFL). These languages are interesting for several reasons. First, certain problems of this kind are related to applications. This problem of testing whether a CFL generates a string is related to the problem of recognizing and compiling programs in a programming language. Second, certain other problems concerning automata and grammars are not decidable by algorithms. Starting with examples where decidability is possible helps you to appreciate the undecidable examples.
DECIDABLE PROBLEMS CONCERNING REGULAR LANGUAGES
We begin with certain computational problems concerning finite automata. We give algorithms for testing whether a finite automaton accepts a string, whether the language of a finite automaton is empty, and whether two finite automata are equivalent.
Note that we chose to represent various computational problems by languages. Doing so is convenient because we have already set up terminology for dealing with languages. For example, the acceptance problem for DFAs of testing whether a particular deterministic finite automaton accepts a given string can be expressed as a language, .Adfa- This language contains the encodings of all DFAs together with strings that the DFAs accept. Let
^-dfa = {{B, w)\ B is a DFA that accepts input string w}.
The problem of testing whether a DFA B accepts an input w is the same as the problem of testing whether {B, w) is a member of the language ^dfa- Similarly, we can formulate other computational problems in terms of testing membership in a language. Showing that the language is decidable is the same as showing that the computational problem is decidable.
In the following theorem we show that Aqfa is decidable. Hence this theorem shows that the problem of testing whether a given finite automaton accepts a given string is decidable.
theorem 4.1 ..................
j4dfa is a decidable language.
4.1      DECIDABLE LANGUAGES 167
proof idea   We simply need to present a TM M that decides ^Idfa-
M — "On input (B, w), where B is a DFA and w is a string:
1. Simulate B on input w.
2. If the simulation ends in an accept state, accept. If it ends in a nonaccepting state, reject."
proof We mention just a few implementation details of this proof. For those of you familiar with writing programs in any standard programming language, imagine how you would write a program to carry out the simulation.
First, let's examine the input (B, w). It is a representation of a DFA B together with a string w. One reasonable representation of B is simply a list of its five components, Q, £, 8, <jo, and F. When M receives its input, M first determines whether it properly represents a DFA B and a string w. If not, M rejects.
Then M carries out the simulation directly. It keeps track of B's current state and B's current position in the input w by writing this information down on its tape. Initially, B's current state is <7o and B's current input position is the leftmost symbol of w. The states and position are updated according to the specified transition function 5. When M finishes processing the last symbol of w, M accepts the input if B is in an accepting state; M rejects the input if B is in a nonaccepting state.
We can prove a similar theorem for nondeterministic finite automata. Let ^-nfa = {{B, w)\ B is an NFA that accepts input string w}.
theorem 4.2 ............................................................................................................................
j4nfa is a decidable language.
proof We present a TM TV that decides ^4nfa- We could design N to operate like M, simulating an NFA instead of a DFA. Instead, we'll do it differently to illustrate a new idea: have N use M as a subroutine. Because M is designed to work with DFAs, TV first converts the NFA it receives as input to a DFA before passing it to M.
TV = "On input (B, w) where B is an NFA, and w is a string:
1. Convert NFA B to an equivalent DFA C, using the procedure for this conversion given in Theorem 1.39.
2. Run TM M from Theorem 4.1 on input (C, w).
3. If M accepts, accept; otherwise, reject."
Running TM M in stage 2 means incorporating M into the design of TV as a subprocedure.
168       CHAPTER 4 / DECIDABILITY
Similarly, we can determine whether a regular expression generates a given string. Let j4rex = {(-ft, u>)\ R is a regular expression that generates string w}.
theorem 4.3
j4rex is a decidable language.
proof   The following TM P decides ^Irex-
P = "On input (R, w) where R is a regular expression and w is a string:
1. Convert regular expression R to an equivalent NFA A by using the procedure for this conversion given in Theorem 1.54.
2. Run TM TV on input {A, w).
3. If TV accepts, accept; if TV rejects, reject."
Theorems 4.1, 4.2, and 4.3 illustrate that, for decidability purposes, presenting the Turing machine with a DFA, NFA, or regular expression are all equivalent because the machine is able to convert one form of encoding to another.
Now we turn to a different kind of problem concerning finite automata: emptiness testing for the language of a finite automaton. In the preceding three theorems we had to determine whether a finite automaton accepts a particular string. In the next proof we must determine whether a finite automaton accepts any strings at all. Let
£dfa = {{A)\ A is a DFA and L(A) = 0}.
theorem 4.4 ............................................................................................................................
£dfa is a decidable language.
proof A DFA accepts some string iff reaching an accept state from the start state by traveling along the arrows of the DFA is possible. To test this condition we can design a TM T that uses a marking algorithm similar to that used in Example 3.23.
T = "On input (A) where A is a DFA:
1. Mark the start state of A.
2. Repeat until no new states get marked:
3. Mark any state that has a transition coming into it from any state that is already marked.
4. If no accept state is marked, accept; otherwise, reject."
4.1      DECIDABLE LANGUAGES 169
The next theorem states that determining whether two DFAs recognize the same language is decidable. Let
EQDFA = {{A,B)\ A and B are DFAs and L(A) = L(B)}.
THEOREM 4.5 ............................................................................................................................
EQDfk is a decidable language.
PROOF To prove this theorem we use Theorem 4.4. We construct a new DFA C from A and B, where C accepts only those strings that are accepted by either A or B but not by both. Thus, if A and B recognize the same language, C will accept nothing. The language of C is
l(C) = (l(A) n l(b?) u (pA) n L(B)).
This expression is sometimes called the symmetric difference of L(A) and L(B) and is illustrated in the following figure. Here L{A)is the complement of L(A). The symmetric difference is useful here because L(C) = 0 iff L(A) = L{B). We can construct C from A and B with the constructions for proving the class of regular languages closed under complementation, union, and intersection. These constructions are algorithms that can be carried out by Turing machines. Once we have constructed C we can use Theorem 4.4 to test whether L(C) is empty. If it is empty, L(A) and L(B) must be equal.
F = "On input {A, B), where A and B are DFAs:
1. Construct DFA C as described.
2. Run TM T from Theorem 4.4 on input (C).
3. If T accepts, accept. If T rejects, reject."
FIGURE 4.6
The symmetric difference of L{A) and L{B)
170
CHAPTER 4 / DECIDABILITY
DECIDABLE PROBLEMS CONCERNING CONTEXT-FREE LANGUAGES
Here, we describe algorithms to determine whether a CFG generates a particular string and to determine whether the language of a CFG is empty. Let
Aqfg — {(G, w) G is a CFG that generates string w}.
theorem 4.7 ............................................................................................................................
Aqpq is a decidable language.
proof idea For CFG G and string w we want to determine whether G generates w. One idea is to use G to go through all derivations to determine whether any is a derivation of w. This idea doesn't work, as infinitely many derivations may have to be tried. If G does not generate w, this algorithm would never halt. This idea gives a Turing machine that is a recognizer, but not a decider, for Aqpq.
To make this Turing machine into a decider we need to ensure that the algorithm tries only finitely many derivations. In Problem 2.26 (page 130) we showed that, if G were in Chomsky normal form, any derivation of w has 2n — 1 steps, where n is the length of w. In that case checking only derivations with 2n — 1 steps to determine whether G generates w would be sufficient. Only finitely many such derivations exist. We can convert G to Chomsky normal form by using the procedure given in Section 2.1.
proof   The TM S for ^cfg follows.
S = "On input (G, w), where G is a CFG and w is a string:
1. Convert G to an equivalent grammar in Chomsky normal form.
2. List all derivations with 2n — 1 steps, where n is the length of w, except if n = 0, then instead list all derivations with 1 step.
3. If any of these derivations generate w, accept; if not, reject."
The problem of determining whether a CFG generates a particular string is related to the problem of compiling programming languages. The algorithm in TM S is very inefficient and would never be used in practice, but it is easy to describe and we aren't concerned with efficiency here. In Part Three of this book we address issues concerning the running time and memory use of algorithms. In the proof of Theorem 7.16, we describe a more efficient algorithm for recognizing context-free languages.
4.1      DECIDABLE LANGUAGES 171
Recall that we have given procedures for converting back and forth between CFGs and PDAs in Theorem 2.20. Hence everything we say about the decidability of problems concerning CFGs applies equally well to PDAs.
Let's turn now to the emptiness testing problem for the language of a CFG. As we did for DFAs, we can show that the problem of determining whether a CFG generates any strings at all is decidable. Let
Eqfc = {(G)| GisaCFGandL(G) = 0}.
theorem 4.8
£cfg is a decidable language.
proof idea To find an algorithm for this problem we might attempt to use TM S from Theorem 4.7. It states that we can test whether a CFG generates some particular string w. To determine whether L{G) = 0 the algorithm might try going through all possible w's, one by one. But there are infinitely many w's to try, so this method could end up running forever. We need to take a different approach.
In order to determine whether the language of a grammar is empty, we need to test whether the start variable can generate a string of terminals. The algorithm does so by solving a more general problem. It determines for each variable whether that variable is capable of generating a string of terminals. When the algorithm has determined that a variable can generate some string of terminals, the algorithm keeps track of this information by placing a mark on that variable.
First, the algorithm marks all the terminal symbols in the grammar. Then, it scans all the rules of the grammar. If it ever finds a rule that permits some variable to be replaced by some string of symbols all of which are already marked, the algorithm knows that this variable can be marked, too. The algorithm continues in this way until it cannot mark any additional variables. The TM R implements this algorithm.
proof
R = "On input (G), where G is a CFG:
1. Mark all terminal symbols in G.
2. Repeat until no new variables get marked:
3. Mark any variable A where G has a rule A —> U1II2 ■ ■ ■ Uk and each symbol Ui,...,Uk has already been marked.
4. If the start variable is not marked, accept; otherwise, reject."
172
CHAPTER 4 / DECIDABILITY
Next we consider the problem of determining whether two context-free grammars generate the same language. Let
eQcfg = {(g-,h)\g and h are CFGs and L(G) = L(H)}.
Theorem 4.5 gave an algorithm that decides the analogous language EQD¥A for finite automata. We used the decision procedure for Eqfa to prove that EQDFA is decidable. Because Eqfg also is decidable, you might think that we can use a similar strategy to prove that EQCFG is decidable. But something is wrong with this idea! The class of context-free languages is not closed under complementation or intersection, as you proved in Exercise 2.2. In fact, EQCFG is not decidable. The technique for proving so is presented in Chapter 5.
Now we show that every context-free language is decidable by a Turing machine.
theorem 4.9
Every context-free language is decidable.
proof idea Let A be a CFL Our objective is to show that A is decidable. One (bad) idea is to convert a PDA for A directly into a TM. That isn't hard to do because simulating a stack with the TM's more versatile tape is easy. The PDA for A may be nondeterministic, but that seems okay because we can convert it into a nondeterministic TM and we know that any nondeterministic TM can be converted into an equivalent deterministic TM. Yet, there is a difficulty. Some branches of the PDA's computation may go on forever, reading and writing the stack without ever halting. The simulating TM then would also have some non-halting branches in its computation, and so the TM would not be a decider. A different idea is necessary. Instead, we prove this theorem with the TM S that we designed in Theorem 4.7 to decide Aqfq.
proof Let G be a CFG for A and design a TM Mq that decides A. We build a copy of G into Mq. It works as follows.
MG = "On input w:
1. Run TM S on input (G, w)
2. If this machine accepts, accept; if it rejects, reject."
Theorem 4.9 provides the final link in the relationship among the four main classes of languages that we have described so far: regular, context free, decidable, and Turing-recognizable. The following figure depicts this relationship.
4.2     THE HALTING PROBLEM 173
4.2
THE HALTING PROBLEM
In this section we prove one of the most philosophically important theorems of the theory of computation: There is a specific problem that is algorithmically unsolvable. Computers appear to be so powerful that you may believe that all problems will eventually yield to them. The theorem presented here demonstrates that computers are limited in a fundamental way.
What sort of problems are unsolvable by computer? Are they esoteric, dwelling only in the minds of theoreticians? No! Even some ordinary problems that people want to solve turn out to be computationally unsolvable.
In one type of unsolvable problem, you are given a computer program and a precise specification of what that program is supposed to do (e.g., sort a list of numbers). You need to verify that the program performs as specified (i.e., that it is correct). Because both the program and the specification are mathematically precise objects, you hope to automate the process of verification by feeding these objects into a suitably programmed computer. However, you will be disappointed. The general problem of software verification is not solvable by computer.
In this section and Chapter 5 you will encounter several computationally unsolvable problems. Our objectives are to help you develop a feel for the types of problems that are unsolvable and to learn techniques for proving unsolvability.
Now we turn to our first theorem that establishes the undecidability of a specific language: the problem of determining whether a Turing machine accepts a given input string. We call it Ajm by analogy with Aqfa and ^cfg- But, whereas
174
CHAPTER 4 / DECIDABILITY
Aqfa and Acfg were decidable, Ajm is not. Let
ATM = {(M. w) | M is a TM and Af accepts ui}.
theorem 4.11 .........................................................................................................................
Ajm is undecidable.
Before we get to the proof, let's first observe that Ajm is Turing-recognizable. Thus this theorem shows that recognizers are more powerful than deciders. Requiring a TM to halt on all inputs restricts the kinds of languages that it can recognize. The following Turing machine U recognizes ^tm-
U = "On input (M, w), where M is a TM and w is a string:
1. Simulate M on input w.
2. If M ever enters its accept state, accept; if M ever enters its reject state, reject."
Note that this machine loops on input (M, w) if M loops on w, which is why this machine does not decide ^4tm- If the algorithm had some way to determine that M was not halting on w, it could reject. Hence Ajm is sometimes called the halting problem. As we demonstrate, an algorithm has no way to make this determination.
The Turing machine U is interesting in its own right. It is an example of the universal Turing machine first proposed by Turing. This machine is called universal because it is capable of simulating any other Turing machine from the description of that machine. The universal Turing machine played an important early role in stimulating the development of stored-program computers.
THE DIAGONALIZATION METHOD
The proof of the undecidability of the halting problem uses a technique called diagonalization, discovered by mathematician Georg Cantor in 1873. Cantor was concerned with the problem of measuring the sizes of infinite sets. If we have two infinite sets, how can we tell whether one is larger than the other or whether they are of the same size? For finite sets, of course, answering these questions is easy. We simply count the elements in a finite set, and the resulting number is its size. But, if we try to count the elements of an infinite set, we will never finish! So we can't use the counting method to determine the relative sizes of infinite sets.
For example, take the set of even integers and the set of all strings over {0,1}. Both sets are infinite and thus larger than any finite set, but is one of the two larger than the other? How can we compare their relative size?
Cantor proposed a rather nice solution to this problem. He observed that two finite sets have the same size if the elements of one set can be paired with the elements of the other set. This method compares the sizes without resorting to counting. We can extend this idea to infinite sets. Let's see what it means more precisely.
4.2
THE HALTING PROBLEM
175
definition 4.12
Assume that we have sets A and B and a function / from A to B. Say that / is one-to-one if it never maps two different elements to the same place—that is, if f(a) ^ fib) whenever a ^ b. Say that / is onto if it hits every element of B—that is, if for every b £ B there is an a e A such that f(a) = b. Say that A and B are the same size if there is a one-to-one, onto function /: A—>B. A function that is both one-to-one and onto is called a correspondence. In a correspondence every element of A maps to a unique element of B and each element of B has a unique element of A mapping to it. A correspondence is simply a way of pairing the elements of A with the elements of B.
example 4.13 ..........................................................................................................................
Let N be the set of natural numbers {1,2,3,...} and let £ be the set of even natural numbers {2,4, 6,... }. Using Cantor's definition of size we can see that M and £ have the same size. The correspondence / mapping TV to £ is simply f{n) = 2n. We can visualize / more easily with the help of a table.
Of course, this example seems bizarre. Intuitively, £ seems smaller than Af because £ is a proper subset of Af. But pairing each member of Af with its own member of £ is possible, so we declare these two sets to be the same size.
definition 4.14
A set A is countable if either it is finite or it has the same size as Af.
example 4.15 ..........................................................................................................................
Now we turn to an even stranger example. If we let Q = | m, n e Af} be the set of positive rational numbers, Q seems to be much larger than Af. Yet these two sets are the same size according to our definition. We give a correspondence with Af to show that Q is countable. One easy way to do so is to list all the elements of Q. Then we pair the first element on the list with the number 1 from Af, the second element on the list with the number 2 from Af, and so on. We must ensure that every member of Q appears only once on the list.
n
fin)
1
2
3
2
4 6
176
CHAPTER 4 / DECIDABILITY
To get this list we make an infinite matrix containing all the positive rational numbers, as shown in Figure 4.16. The zth row contains all numbers with numerator i and the jth column has all numbers with denominator j. So the
number * occurs in the ith row and ?th column.
j J
Now we turn this matrix into a list. One (bad) way to attempt it would be to begin the list with all the elements in the first row. That isn't a good approach because the first row is infinite, so the list would never get to the second row. Instead we list the elements on the diagonals, starting from the corner, which are superimposed on the diagram. The first diagonal contains the single element j, and the second diagonal contains the two elements \ and \. So the first three elements on the list are \, |, and \. In the third diagonal a complication arises. It contains f, §, and \. If we simply added these to the list, we would repeat \ = |. We avoid doing so by skipping an element when it would cause a repetition. So we add only the two new elements \ and |. Continuing in this way we obtain a list of all the elements of Q.
figure 4.16
A correspondence of N and Q
After seeing the correspondence of M and Q, you might think that any two infinite sets can be shown to have the same size. After all, you need only demonstrate a correspondence, and this example shows that surprising correspondences do exist. However, for some infinite sets no correspondence with M exists. These sets are simply too big. Such sets are called uncountable.
The set of real numbers is an example of an uncountable set. A real number is one that has a decimal representation. The numbers it = 3.1415926... and V2 = 1.4142135... are examples of real numbers. Let 1Z be the set of real numbers. Cantor proved that 1Z is uncountable. In doing so he introduced the diagonalization method.
4.2     THE HALTING PROBLEM 177
theorem 4.17
TZ is uncountable.
proof In order to show that TZ is uncountable, we show that no correspondence exists between TV" and TZ. The proof is by contradiction. Suppose that a correspondence / existed between Af and TZ. Our job is to show that / fails to work as it should. For it to be a correspondence, / must pair all the members of TV" with all the members of TZ. But we will find an x in TZ that is not paired with anything in Af, which will be our contradiction.
The way we find this x is by actually constructing it. We choose each digit of x to make x different from one of the real numbers that is paired with an element of Af. In the end we are sure that x is different from any real number that is paired.
We can illustrate this idea by giving an example. Suppose that the correspondence / exists. Let /(l) = 3.14159 ..., /(2) = 55.55555..., /(3) and so on, just to make up some values for /. Then / pairs the number 1 with 3.14159..., the number 2 with 55.55555..., and so on. The following table shows a few values of a hypothetical correspondence / between Af and TZ.
We construct the desired x by giving its decimal representation. It is a number between 0 and 1, so all its significant digits are fractional digits following the decimal point. Our objective is to ensure that x ^ f(n) for any n. To ensure that x 7^ /(l) we let the first digit of x be anything different from the first fractional
digit 1 of/(l) = 3.14159----Arbitrarily, we let it be 4. To ensure that x ^ /(2)
we let the second digit of x be anything different from the second fractional digit 5 of /(2) = 55.555555 .... Arbitrarily, we let it be 6. The third fractional digit of /(3) = 0.12345 ... is 3, so we let x be anything different—say, 4. Continuing in this way down the diagonal of the table for /, we obtain all the digits of x, as shown in the following table. We know that x is not f(n) for any n because it differs from f(n) in the nth fractional digit. (A slight problem arises because certain numbers, such as 0.1999 ... and 0.2000 ..., are equal even though their decimal representations are different. We avoid this problem by never selecting the digits 0 or 9 when we construct x.)
n
f(n)
1
2
3
4
3.14159... 55.55555... 0.12345... 0.50000...
178
CHAPTER 4 / DECIDABILITY
n	f(n)	
1	3	14159...
2	55	55555...
3	0	12345...
4	0	50000...
The preceding theorem has an important application to the theory of computation. It shows that some languages are not decidable or even Turing-recognizable, for the reason that there are uncountably many languages yet only countably many Turing machines. Because each Turing machine can recognize a single language and there are more languages than Turing machines, some languages are not recognized by any Turing machine. Such languages are not Turing-recognizable, as we state in the following corollary.
corollary 4.18
Some languages are not Turing-recognizable.
proof To show that the set of all Turing machines is countable we first observe that the set of all strings E* is countable, for any alphabet E. With only finitely many strings of each length, we may form a list of E* by writing down all strings of length 0, length 1, length 2, and so on.
The set of all Turing machines is countable because each Turing machine M has an encoding into a string (M). If we simply omit those strings that are not legal encodings of Turing machines, we can obtain a list of all Turing machines.
To show that the set of all languages is uncountable we first observe that the set of all infinite binary sequences is uncountable. An infinite binary sequence is an unending sequence of 0s and Is. Let B be the set of all infinite binary sequences. We can show that B is uncountable by using a proof by diagonalization similar to the one we used in Theorem 4.17 to show that 1Z is uncountable.
Let C be the set of all languages over alphabet E. We show that C is uncountable by giving a correspondence with B, thus showing that the two sets are the same size. Let E* = {si, s2, S3,... }. Each language A € L has a unique sequence in B. The ith bit of that sequence is a 1 if s, G A and is a 0 if si A, which is called the characteristic sequence of A. For example, if A were the language of all strings starting with a 0 over the alphabet {0,1}, its characteristic sequence xa would be
E* = { e , 0 , 1 , 00 , 01 , 10 , 11 , 000 , 001 , • • • } ; A = { 0   , 00 , 01 , 000 , 001 , • • • } ;
Xa=     0     1     0     1     1     0     0     1     1     ■■• .
The function /: £—>B, where f(A) equals the characteristic sequence of A, is one-to-one and onto and hence a correspondence. Therefore, as B is un-
4.2     THE HALTING PROBLEM 179
countable, L is uncountable as well.
Thus we have shown that the set of all languages cannot be put into a correspondence with the set of all Turing machines. We conclude that some languages are not recognized by any Turing machine.
THE HALTING PROBLEM IS UNDECIDABLE
Now we are ready to prove Theorem 4.11, the undecidability of the language -4tm = {(M, w) \ M is a TM and M accepts w}.
proof We assume that Ajm is decidable and obtain a contradiction. Suppose that H is a decider for ^4tm- On input (M, w), where M is a TM and w is a string, H halts and accepts if M accepts w. Furthermore, H halts and rejects if M fails to accept w. In other words, we assume that H is a TM, where
Now we construct a new Turing machine D with H as a subroutine. This new TM calls H to determine what M does when the input to M is its own description (M). Once D has determined this information, it does the opposite. That is, it rejects if M accepts and accepts if M does not accept. The following is a description of D.
D = "On input (M), where M is a TM:
1. Run H on input (M, {M)).
2. Output the opposite of what H outputs; that is, if H accepts, reject and if H rejects, accept."
Don't be confused by the idea of running a machine on its own description! That is similar to running a program with itself as input, something that does occasionally occur in practice. For example, a compiler is a program that translates other programs. A compiler for the language Pascal may itself be written in Pascal, so running that program on itself would make sense. In summary,
What happens when we run D with its own description (D) as input? In that case we get
No matter what D does, it is forced to do the opposite, which is obviously a contradiction. Thus neither TM D nor TM H can exist.
accept   if M accepts w
reject    if M does not accept w.
accept if M does not accept (M) reject    if M accepts (M).
accept if D does not accept (D) reject    if D accepts (D).
180       CHAPTER 4 / DECIDABILITY
Let's review the steps of this proof. Assume that a TM H decides Ajm- Then use H to build a TM D that when given input (M) accepts exactly when M does not accept input (M). Finally, run D on itself. The machines take the following actions, with the last line being the contradiction.
• H accepts (M, w) exactly when M accepts w.
• D rejects (M) exactly when M accepts (M).
• D rejects (D) exactly when D accepts (D).
Where is the diagonalization in the proof of Theorem 4.11 ? It becomes apparent when you examine tables of behavior for TMs H and D. In these tables we list all TMs down the rows, Mi, M2, . ■ • and all their descriptions across the columns, (Mi), (M2), ... The entries tell whether the machine in a given row accepts the input in a given column. The entry is accept if the machine accepts the input but is blank if it rejects or loops on that input. We made up the entries in the following figure to illustrate the idea.
figure 4.19
Entry i,j is accept if M, accepts (Mj)
In the following figure the entries are the results of running H on inputs corresponding to Figure 4.18. So if M3 does not accept input (M2), the entry for row M3 and column (M2) is reject because H rejects input (M3, (M2)).
(Mi)     (M2)     (M3) (M4)
Mi M2
M3 Mi
accept accept
accept   accept   accept accept
accept accept
(Afi)     (M2)     (M3) (M4)
Mi M2 M3 M4
accept reject accept reject
accept accept accept accept
reject reject reject reject
accept accept reject reject
figure 4.20
Entry i, j is the value of H on input (Mj, (Mj))
4.2     THE HALTING PROBLEM 181
In the following figure, we added D to Figure 4.19. By our assumption, H is a TM and so is D. Therefore it must occur on the list M\, M2, ... of all TMs. Note that D computes the opposite of the diagonal entries. The contradiction occurs at the point of the question mark where the entry must be the opposite of itself.
	(Mi)	<M2)	(M3)	<M4) ■	■ (D)
Mi	accept	reject	accept	reject	accept
M2	accept	accept	accept	accept	accept
M3	reject	reject	reject	reject	reject
M4	accept	accept	reject	reject	accept
D	reject	reject	accept	accept	?
figure 4.21
If D is in the figure, a contradiction occurs at "?"
A TURING-UNRECOGNIZABLE LANGUAGE
In the preceding section we demonstrated a language—namely, Ajm—that is un-decidable. Now we demonstrate a language that isn't even Turing-recognizable. Note that Ajm will not suffice for this purpose because we showed that Ajm is Turing-recognizable (page 174). The following theorem shows that, if both a language and its complement are Turing-recognizable, the language is decid-able. Hence, for any undecidable language, either it or its complement is not Turing-recognizable. Recall that the complement of a language is the language consisting of all strings that are not in the language. We say that a language is co-Turing-recognizable if it is the complement of a Turing-recognizable language.
theorem 4.22 .........................................................................................................................
A language is decidable iff it is Turing-recognizable and co-Turing-recognizable.
In other words, a language is decidable exactly when both it and its complement are Turing-recognizable.
proof We have two directions to prove. First, if A is decidable, we can easily see that both A and its complement A are Turing-recognizable. Any decidable language is Turing-recognizable, and the complement of a decidable language also is decidable.
For the other direction, if both A and A are Turing-recognizable, we let M\ be the recognizer for A and M2 be the recognizer for A. The following Turing
182       CHAPTER 4 / DECIDABILITY
machine M is a decider for A.
M = "On input w.
1. Run both Mi and M2 on input w in parallel.
2. If Mi accepts, accept; if M2 accepts, reject."
Running the two machines in parallel means that M has two tapes, one for simulating Mi and the other for simulating M2. In this case M takes turns simulating one step of each machine, which continues until one of them accepts.
Now we show that M decides A. Every string w is either in A or A. Therefore either Mi or M2 must accept w. Because M halts whenever Mi or M2 accepts, M always halts and so it is a decider. Furthermore, it accepts all strings in A and rejects all strings not in A. So M is a decider for A, and thus A is decidable.
corollary 4.23
Ajm is not Turing-recognizable.
proof We know that ^4jm is Turing-recognizable. If ^4tm also were Turing-recognizable, ^4tm would be decidable. Theorem 4.11 tells us that ^4tm is not decidable, so Ajm must not be Turing-recognizable.
A4.1 Answer all parts for the following DFA M and give reasons for your answers.
EXERCISES
1
0,1
a. Is (M,0100) e Adfa?
b. Is (M,011) £ ,4dfa?
c. Is (M) E Adfa?
d. Is (Af,0100) e Arex?
e. Is (M) 6 £dfa?
f. Is (M, M) 6 EQDFA?
PROBLEMS
183
4.2 Consider the problem of determining whether a DFA and a regular expression are equivalent. Express this problem as a language and show that it is decidable.
4.3 Let AIIdfa = {(A) \ Aisa DFA and L(A) = £*}. Show that ALLDFa is decidable.
4.4 Let Asqfg = {(G) j G is a CFG that generates e}. Show that Ascfg is decidable.
4.5 Let X be the set {1, 2, 3, 4, 5} and Y be the set {6, 7, 8, 9,10}. We describe the functions /: X—>F and g: X—>Y in the following tables. Answer each part and give a reason for each negative answer.
n	f(n)	n	
1	6	1	10
2	7	2	9
3	6	3	8
4	7	4	7
5	6	5	6
Aa. Is / one-to-one? Ad. Is g one-to-one?
b. Is / onto? e. Is g onto?
c. Is / a correspondence? f. Is g a correspondence?
4.6 Let B be the set of all infinite sequences over {0,1}. Show that B is uncountable, using a proof by diagonalization.
4.7 Let T =        k) \      k E TV}. Show that T is countable.
4.8 Review the way that we define sets to be the same size in Definition 4.12 (page 175). Show that "is the same size" is an equivalence relation.
PROBLEMS
A4.9 Let INFINITEdfa = {{A)\ A is a DFA and L(A) is an infinite language}. Show that INFINITEdfa is decidable.
4.10 Let INFINITEpoa = {(M) \ M is a PDA and L{M) is an infinite language}. Show that INFINITEpua is decidable.
A4.11 Let A = {(M)\ M is a DFA which doesn't accept any string containing an odd number of Is}. Show that A is decidable.
4.12 Let A = {(R, S}\ R and S are regular expressions and L(R) C L(S)}. Show that A is decidable.
A4.13 Let E = {0,1}. Show that the problem of determining whether a CFG generates some string in 1* is decidable. In other words, show that
{(G)| G is a CFG over {0,1} and 1* n L{G) / 0}
is a decidable language.
*4.14 Show that the problem of determining whether a CFG generates all strings in 1* is decidable. In other words, show that {(G) G is a CFG over {0,1} and 1* C L(G)} is a decidable language.
1 84       CHAPTER 4 / DECIDABILITY
4.15 Let A = {(R) J R is a regular expression describing a language containing at least one string w that has 111 as a substring (i.e., w = xllly for some x and y)}. Show that A is decidable.
4.16 Prove that EQDFA is decidable by testing the two DFAs on all strings up to a certain size. Calculate a size that works.
*4.17 Let C be a language. Prove that C is Turing-recognizable iff a decidable language D exists such that C = {a;| By {(x, y) £ D)}.
4.18 Let A and B be two disjoint languages. Say that language C separates A and B if A C C and B C C. Show that any two disjoint co-Turing-recognizable languages are separable by some decidable language.
4.19 Let S = {(M) j M is a DFA that accepts wK whenever it accepts w}. Show that 5 is decidable.
4.20 A language is prefix-free if no member is a proper prefix of another member. Let PREFIX-FREErex = {R\ R is a regular expression where L(R) is prefix-free}. Show that PREFIX-FREErex is decidable. Why does a similar approach fail to show that PREFIX-FREEqfg is decidable?
*4.21 Say that an NFA is ambiguous if it accepts some string along two different computation branches. Let AMBIGnfa = {{N)\ N is an ambiguous NFA}. Show that AMBIGnfa is decidable. (Suggestion: One elegant way to solve this problem is to construct a suitable DFA and then run £dfa on it.)
4.22 A useless state in a pushdown automaton is never entered on any input string. Consider the problem of determining whether a pushdown automaton has any useless states. Formulate this problem as a language and show that it is decidable.
*4.23 Let BALofa = {{M)\ M is a DFA that accepts some string containing an equal number of 0s and Is}. Show that BALdfa is decidable. (Hint: Theorems about CFLs are helpful here.)
*4.24 Let PALdfa = {{M}\ M is a DFA that accepts some palindrome}. Show that PALdfa is decidable. (Hint: Theorems about CFLs are helpful here.)
*4.25 Let E = {(M) \ M is a DFA that accepts some string with more Is than Os}. Show that E is decidable. (Hint: Theorems about CFLs are helpful here.)
4.26 Let C = {(G, x)\ G is a CFG that generates some string w, where a; is a substring of w}. Show that C is decidable. (Suggestion: An elegant solution to this problem uses the decider for Eqfg)
4.27 Let Ccfg = {(G,k)\ L(G) contains exactly k strings where k > 0 or k = oo}. Show that Ccfg is decidable.
4.28 Let A be a Turing-recognizable language consisting of descriptions of Turing machines, {(Mi), (M2), ... }, where every Mi is a decider. Prove that some decidable language D is not decided by any decider Mt whose description appears in A. (Hint: You may find it helpful to consider an enumerator for A)
SELECTED SOLUTIONS
185
SELECTED SOLUTIONS
4.1  (a) Yes. The DFA M accepts 0100.
(b) No. M doesn't accept Oil.
(c) No. This input has only a single component and thus is not of the correct form.
(d) No. The first component is not a regular expression and so the input is not of the correct form.
(e) No. M's language isn't empty.
(f) Yes. M accepts the same language as itself.
4.5 (a) No, / is not one-to-one because /(l) = /(3). (d) Yes, g is one-to-one.
4.9 The following TM I decides INFINITEDFA.
I = "On input (A) where A is a DFA:
1. Let k be the number of states of A.
2. Construct a DFA D that accepts all strings of length k or more.
3. Construct a DFA M such that L(M) = L(A) n L(D).
4. Test L(M) = 0, using the £qfa decider T from Theorem 4.4.
5. If T accepts, reject; if T rejects, accept."
This algorithm works because a DFA which accepts infinitely many strings must accept arbitrarily long strings. Therefore this algorithm accepts such DFAs. Conversely, if the algorithm accepts a DFA, the DFA accepts some string of length k or more, where k is the number of states of the DFA. This string may be pumped in the manner of the pumping lemma for regular languages to obtain infinitely many accepted strings.
4.11 The following TM decides A.
"On input (M):
1. Construct a DFA O that accepts every string containing an odd number of Is.
2. Construct DFA B such that L{B) = L(M) n L(O).
3. Test whether L(B) = 0, using the Edfa decider T from Theorem 4.4.
4. If T accepts, accept; if T rejects, reject."
4.13 You showed in Problem 2.18 that, if C is a context-free language and R is a regular language, then C n R is context free. Therefore 1* n L(G) is context free. The following TM decides A.
"On input <G):
1. Construct CFG H such that L(H) = 1* n L{G).
2. Test whether L(H) = 0, using the Ecfg decider R from Theorem 4.8.
3. If R accepts, reject; if R rejects, accept."
186       CHAPTER 4 / DECIDABILITY
4.21 The following procedure decides AMBIGufa- Given an NFA N, we design a DFA D that simulates N and accepts a string iff it is accepted by N along two different computational branches. Then we use a decider for Edfa to determine whether D accepts any strings.
Our strategy for constructing D is similar to the NFA to DFA conversion in the proof of Theorem 1.39. We simulate N by keeping a pebble on each active state. We begin by putting a red pebble on the start state and on each state reachable from the start along e transitions. We move, add, and remove pebbles in accordance with N's transitions, preserving the color of the pebbles. Whenever two or more pebbles are moved to the same state, we replace its pebbles with a blue pebble. After reading the input, we accept if a blue pebble is on an accept states of N. The DFA D has a state corresponding to each possible position of pebbles. For each state of N, three possibilities occur: it can contain a red pebble, a blue pebble, or no pebble. Thus, if N has n states, D will have 3n states. Its start state, accept states, and transition function are defined to carry out the simulation.
4.23 The language of all strings with an equal number of Os and Is is a context-free language, generated by the grammar S —> 1S0S | OSlS | e. Let P be the PDA that recognizes this language. Build a TM M for BALdfa, which operates as follows. On input (B), where B is a DFA, use B and P to construct a new PDA R that recognizes the intersection of the languages of B and P. Then test whether R's language is empty. If its language is empty, reject; otherwise, accept.