INTRODUCTION TO AUTOMATA THEORY, LANGUAGES, AND COMPUTATION

JOHN E. HOPCROFT, Cornell University
JEFFREY D. ULLMAN, Princeton University

ADDISON-WESLEY PUBLISHING COMPANY
Reading, Massachusetts • Menlo Park, California • London • Amsterdam • Don Mills, Ontario • Sydney
This book is in the ADDISON-WESLEY SERIES IN COMPUTER SCIENCE
Michael A. Harrison, Consulting Editor
Library of Congress Cataloging in Publication Data

Hopcroft, John E., 1939-
Introduction to automata theory, languages, and computation.
Bibliography: p.
Includes index.
1. Machine theory. 2. Formal languages. 3. Computational complexity. I. Ullman, Jeffrey D., joint author. II. Title.
QA267.H56  629.8'312  78-67950
ISBN 0-201-02988-X
Copyright © 1979 by Addison-Wesley Publishing Company, Inc. Philippines copyright 1979 by Addison-Wesley Publishing Company, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America. Published simultaneously in Canada.

Library of Congress Catalog Card No. 78-67950.

ISBN: 0-201-02988-X
LMNOPQ-DO-89876
PREFACE
Ten years ago the authors undertook to produce a book covering the known material on formal languages, automata theory, and computational complexity. In retrospect, only a few significant results were overlooked in the 237 pages. In writing a new book on the subject, we find the field has expanded in so many new directions that a uniform, comprehensive coverage is impossible. Rather than attempt to be encyclopedic, we have been brutal in our editing of the material, selecting only topics central to the theoretical development of the field or with importance to engineering applications.

Over the past ten years two directions of research have been of paramount importance. First has been the use of language-theory concepts, such as nondeterminism and the complexity hierarchies, to prove lower bounds on the inherent complexity of certain practical problems. Second has been the application of language-theory ideas, such as regular expressions and context-free grammars, in the design of software, such as compilers and text processors. Both of these developments have helped shape the organization of this book.
USE OF THE BOOK

Both authors have used Chapters 1 through 8 for a senior-level course, omitting only the material on inherent ambiguity in Chapter 4 and portions of Chapter 8. Chapters 7, 8, 12, and 13 form the nucleus of a course on computational complexity. An advanced course on language theory could be built around Chapters 2 through 7, 9 through 11, and 14.
EXERCISES

We use the convention that the most difficult problems are doubly starred, and problems of intermediate difficulty are identified by a single star. Exercises marked with an S have solutions at the end of the chapter. We have not attempted to provide a solution manual, but have selected a few exercises whose solutions are particularly instructive.
ACKNOWLEDGMENTS

We would like to thank the following people for their perceptive comments and advice: Al Aho, Nissim Francez, Jon Goldstine, Juris Hartmanis, Dave Maier, Fred Springsteel, and Jacobo Valdes. The manuscript was expertly typed by Marie Olton and April Roberts at Cornell and Gerree Pecht at Princeton.

Ithaca, New York
Princeton, New Jersey
March 1979

J. E. H.
J. D. U.
CONTENTS

Chapter 1  Preliminaries
1.1  Strings, alphabets, and languages  1
1.2  Graphs and trees  2
1.3  Inductive proofs  4
1.4  Set notation  5
1.5  Relations  6
1.6  Synopsis of the book  8

Chapter 2  Finite Automata and Regular Expressions
2.1  Finite state systems  13
2.2  Basic definitions  16
2.3  Nondeterministic finite automata  19
2.4  Finite automata with ε-moves  24
2.5  Regular expressions  28
2.6  Two-way finite automata  36
2.7  Finite automata with output  42
2.8  Applications of finite automata  45

Chapter 3  Properties of Regular Sets
3.1  The pumping lemma for regular sets  55
3.2  Closure properties of regular sets  58
3.3  Decision algorithms for regular sets  63
3.4  The Myhill-Nerode theorem and minimization of finite automata  65

Chapter 4  Context-Free Grammars
4.1  Motivation and introduction  77
4.2  Context-free grammars  79
4.3  Derivation trees  82
4.4  Simplification of context-free grammars  87
4.5  Chomsky normal form  92
4.6  Greibach normal form  94
4.7  The existence of inherently ambiguous context-free languages  99

Chapter 5  Pushdown Automata
5.1  Informal description  107
5.2  Definitions  108
5.3  Pushdown automata and context-free languages  114

Chapter 6  Properties of Context-Free Languages
6.1  The pumping lemma for CFL's  125
6.2  Closure properties of CFL's  130
6.3  Decision algorithms for CFL's  137

Chapter 7  Turing Machines
7.1  Introduction  146
7.2  The Turing machine model  147
7.3  Computable languages and functions  150
7.4  Techniques for Turing machine construction  153
7.5  Modifications of Turing machines  159
7.6  Church's hypothesis  166
7.7  Turing machines as enumerators  167
7.8  Restricted Turing machines equivalent to the basic model  170

Chapter 8  Undecidability
8.1  Problems  177
8.2  Properties of recursive and recursively enumerable languages  179
8.3  Universal Turing machines and an undecidable problem  181
8.4  Rice's theorem and some more undecidable problems  185
8.5  Undecidability of Post's correspondence problem  193
8.6  Valid and invalid computations of TM's: a tool for proving CFL problems undecidable  201
8.7  Greibach's theorem  205
8.8  Introduction to recursive function theory  207
8.9  Oracle computations  209

Chapter 9  The Chomsky Hierarchy
9.1  Regular grammars  217
9.2  Unrestricted grammars  220
9.3  Context-sensitive languages  223
9.4  Relations between classes of languages  227

Chapter 10  Deterministic Context-Free Languages
10.1  Normal forms for DPDA's  234
10.2  Closure of DCFL's under complementation  235
10.3  Predicting machines  240
10.4  Additional closure properties of DCFL's  243
10.5  Decision properties of DCFL's  246
10.6  LR(0) grammars  248
10.7  LR(0) grammars and DPDA's  252
10.8  LR(k) grammars  260

Chapter 11  Closure Properties of Families of Languages
11.1  Trios and full trios  270
11.2  Generalized sequential machine mappings  272
11.3  Other closure properties of trios  276
11.4  Abstract families of languages  277
11.5  Independence of the AFL operations  279
11.6  Summary  279

Chapter 12  Computational Complexity Theory
12.1  Definitions  285
12.2  Linear speed-up, tape compression, and reductions in the number of tapes  288
12.3  Hierarchy theorems  295
12.4  Relations among complexity measures  300
12.5  Translational lemmas and nondeterministic hierarchies  302
12.6  Properties of general complexity measures: the gap, speedup, and union theorems  306
12.7  Axiomatic complexity theory  312

Chapter 13  Intractable Problems
13.1  Polynomial time and space  320
13.2  Some NP-complete problems  324
13.3  The class co-NP  341
13.4  PSPACE-complete problems  343
13.5  Complete problems for P and NSPACE(log n)  347
13.6  Some provably intractable problems  350
13.7  The P = NP question for Turing machines with oracles: limits on our ability to tell whether P = NP  362

Chapter 14  Highlights of Other Important Language Classes
14.1  Auxiliary pushdown automata  377
14.2  Stack automata  381
14.3  Indexed languages  389
14.4  Developmental systems  390

Bibliography  396
Index  411
CHAPTER 1

PRELIMINARIES

In this chapter we survey the principal mathematical ideas necessary for understanding the material in this book. These concepts include graphs, trees, sets, relations, strings, abstract languages, and mathematical induction. We also provide a brief introduction to, and motivation for, the entire work. The reader with a background in the mathematical subjects mentioned can skip to Section 1.6 for motivational remarks.
1.1 STRINGS, ALPHABETS, AND LANGUAGES

A "symbol" is an abstract entity that we shall not define formally, just as "point" and "line" are not defined in geometry. Letters and digits are examples of frequently used symbols. A string (or word) is a finite sequence of symbols juxtaposed. For example, a, b, and c are symbols and abcb is a string. The length of a string w, denoted |w|, is the number of symbols composing the string. For example, abcb has length 4. The empty string, denoted by ε, is the string consisting of zero symbols. Thus |ε| = 0.
A prefix of a string is any number of leading symbols of that string, and a suffix is any number of trailing symbols. For example, string abc has prefixes ε, a, ab, and abc; its suffixes are ε, c, bc, and abc. A prefix or suffix of a string, other than the string itself, is called a proper prefix or suffix.
The concatenation of two strings is the string formed by writing the first, followed by the second, with no intervening space. For example, the concatenation of dog and house is doghouse. Juxtaposition is used as the concatenation operator. That is, if w and x are strings, then wx is the concatenation of these two strings. The empty string is the identity for the concatenation operator. That is, εw = wε = w for each string w.
An alphabet is a finite set of symbols. A (formal) language is a set of strings of symbols from some one alphabet. The empty set, ∅, and the set consisting of the empty string, {ε}, are languages. Note that they are distinct; the latter has a member while the former does not. The set of palindromes (strings that read the same forward and backward) over the alphabet {0, 1} is an infinite language. Some members of this language are ε, 0, 1, 00, 11, 010, and 1101011. Note that the set of all palindromes over an infinite collection of symbols is technically not a language, because its strings are not collectively built from an alphabet.

Another language is the set of all strings over a fixed alphabet Σ. We denote this language by Σ*. For example, if Σ = {a}, then Σ* = {ε, a, aa, aaa, ...}. If Σ = {0, 1}, then Σ* = {ε, 0, 1, 00, 01, 10, 11, 000, ...}.

1.2 GRAPHS AND TREES

A graph, denoted G = (V, E), consists of a finite set of vertices (or nodes) V and a set of pairs of vertices E called edges. An example graph is shown in Fig. 1.1. Here V = {1, 2, 3, 4, 5} and E = {(n, m) | n + m = 4 or n + m = 7}.

A path in a graph is a sequence of vertices v_1, v_2, ..., v_k, k ≥ 1, such that there is an edge (v_i, v_{i+1}) for each i, 1 ≤ i < k. The length of the path is k − 1. For example, 1, 3, 4 is a path in the graph of Fig. 1.1; so is 5 by itself. If v_1 = v_k, the path is a cycle.

Fig. 1.1 Example of a graph.
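The graph of Fig. 1.1 is small enough that the path definition can be checked mechanically. The Python sketch below is our illustration, not the book's; it builds E from the rule n + m = 4 or n + m = 7 (taking edges as unordered pairs of distinct vertices) and tests the example paths.

```python
V = {1, 2, 3, 4, 5}
# E = {(n, m) | n + m = 4 or n + m = 7}, as unordered pairs of distinct
# vertices; this yields the edges {1,3}, {2,5}, and {3,4}.
E = {frozenset((n, m)) for n in V for m in V if n != m and n + m in (4, 7)}

def is_path(vertices):
    # v1, ..., vk (k >= 1) is a path if consecutive vertices share an edge.
    if len(vertices) == 0:
        return False
    return all(frozenset((a, b)) in E
               for a, b in zip(vertices, vertices[1:]))

assert len(E) == 3
assert is_path([1, 3, 4])   # the example path; its length is 2
assert is_path([5])         # a single vertex is a path of length 0
assert not is_path([1, 2])  # 1 + 2 is neither 4 nor 7, so no edge
```

The path length in the text's sense is one less than the number of vertices in the sequence.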
Directed graphs

A directed graph (or digraph), also denoted G = (V, E), consists of a finite set of vertices V and a set of ordered pairs of vertices E called arcs. We denote an arc from v to w by v → w. An example of a digraph appears in Fig. 1.2. A path in a digraph is a sequence of vertices v_1, v_2, ..., v_k, k ≥ 1, such that v_i → v_{i+1} is an arc for each i, 1 ≤ i < k. We say the path is from v_1 to v_k. Thus 1 → 2 → 3 → 4 is a path from 1 to 4 in the digraph of Fig. 1.2. If v → w is an arc, we say v is a predecessor of w and w is a successor of v.
Fig. 1.2 The digraph ({1, 2, 3, 4}, {i → j | i < j}).
Trees

A tree (strictly speaking, an ordered, directed tree) is a digraph with the following properties.
1) There is one vertex, called the root, that has no predecessors and from which there is a path to every vertex.
2) Each vertex other than the root has exactly one predecessor.
3) The successors of each vertex are ordered "from the left."
We shall draw trees with the root at the top and all arcs pointing downward. The arrows on the arcs are therefore not needed to indicate direction, and they will not be shown. The successors of each vertex will be drawn in left-to-right order. Figure 1.3 shows an example of a tree which is the "diagram" of the English sentence "The quick brown fox jumped over the lazy dog." The vertices are not named in this example, but are given "labels," which are either words or parts of speech.
[Figure: a parse tree of the sentence. Interior vertices are labeled with parts of speech in angle brackets, such as ⟨sentence⟩, ⟨noun phrase⟩, ⟨verb phrase⟩, ⟨adjective⟩, ⟨noun⟩, and ⟨verb⟩; the leaves are labeled with the words of the sentence.]

Fig. 1.3 A tree.
There is a special terminology for trees that differs from the general terminology for arbitrary graphs. A successor of a vertex is called a son, and the predecessor is called the father. If there is a path from vertex v_1 to vertex v_2, then v_1 is said to be an ancestor of v_2, and v_2 is said to be a descendant of v_1. Note that the case v_1 = v_2 is not ruled out; any vertex is an ancestor and a descendant of itself. A vertex with no sons is called a leaf, and the other vertices are called interior vertices.

For example, in Fig. 1.3, the vertex labeled ⟨verb⟩ is a son of the vertex labeled ⟨verb phrase⟩, and the latter is the father of the former. The vertex labeled "dog" is a descendant of itself, the vertex labeled ⟨verb phrase⟩, the vertex labeled ⟨sentence⟩, and six other vertices. The vertices labeled by English words are the leaves, and those labeled by parts of speech enclosed in angle brackets are the interior vertices.
1.3 INDUCTIVE PROOFS

Many theorems in this book are proved by mathematical induction. Suppose we have a statement P(n) about a nonnegative integer n. A commonly chosen example is to take P(n) to be

    \sum_{i=0}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}.    (1.1)

The principle of mathematical induction is that P(n) follows from
a) P(0), and
b) P(n − 1) implies P(n) for n ≥ 1.

Condition (a) in an inductive proof is called the basis, and condition (b) is called the inductive step. The left-hand side of (b), that is, P(n − 1), is called the inductive hypothesis.
Example 1.1 Let us prove (1.1) by mathematical induction. We establish (a) by substituting 0 for n in (1.1) and observing that both sides are 0. To prove (b), we substitute n − 1 for n in (1.1) and try to prove (1.1) from the result. That is, we must show for n ≥ 1 that

    \sum_{i=0}^{n-1} i^2 = \frac{(n-1)n(2n-1)}{6}  implies  \sum_{i=0}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}.

Since n ≥ 1,

    \sum_{i=0}^{n} i^2 = n^2 + \sum_{i=0}^{n-1} i^2,

and since we are given the inductive hypothesis, we need only show that

    n^2 + \frac{(n-1)n(2n-1)}{6} = \frac{n(n+1)(2n+1)}{6}.

The latter equality follows from simple algebraic manipulation, proving (1.1).

1.4 SET NOTATION

We assume that the reader is familiar with the notion of a set, a collection of objects (members of the set) without repetition. Finite sets may be specified by listing their members between brackets. For example, we used {0, 1} to denote the alphabet of symbols 0 and 1. We may also specify sets by a set former:

    {x | P(x)},    (1.2)
or
    {x in A | P(x)}.    (1.3)
Statement (1.2) is read "the set of objects x such that P(x) is true," where P(x) is some statement about objects x. Statement (1.3) is "the set of x in set A such that P(x) is true," and is equivalent to {x | P(x) and x is in A}. For example,

    {i | i is an integer and there exists integer j such that i = 2j}

is a way of specifying the even integers.

If every member of A is a member of B, then we write A ⊆ B and say A is contained in B. A ⊇ B is synonymous with B ⊆ A. If A ⊆ B but A ≠ B, that is, every member of A is in B and there is some member of B that is not in A, then we write A ⊊ B. Sets A and B are equal if they have the same members. That is, A = B if and only if A ⊆ B and B ⊆ A.
Operations on sets

The usual operations defined on sets are:
1) A ∪ B, the union of A and B, is {x | x is in A or x is in B}.
2) A ∩ B, the intersection of A and B, is {x | x is in A and x is in B}.
3) A − B, the difference of A and B, is {x | x is in A and x is not in B}.
4) A × B, the Cartesian product of A and B, is the set of ordered pairs (a, b) such that a is in A and b is in B.
5) 2^A, the power set of A, is the set of all subsets of A.
Example 1.2 Let A = {1, 2} and B = {2, 3}. Then

    A ∪ B = {1, 2, 3},    A ∩ B = {2},    A − B = {1},
    A × B = {(1, 2), (1, 3), (2, 2), (2, 3)},

and

    2^A = {∅, {1}, {2}, {1, 2}}.

Note that if A and B have n and m members, respectively, then A × B has nm members and 2^A has 2^n members.
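Example 1.2 can be reproduced directly with Python's built-in set type. This is our illustration, not the book's; the `power_set` helper is an invented name for the 2^A operation, which Python does not provide built in.

```python
from itertools import combinations

A = {1, 2}
B = {2, 3}

assert A | B == {1, 2, 3}  # union
assert A & B == {2}        # intersection
assert A - B == {1}        # difference
# Cartesian product A x B as a set of ordered pairs.
assert {(a, b) for a in A for b in B} == {(1, 2), (1, 3), (2, 2), (2, 3)}

def power_set(s):
    # All subsets of s, as frozensets so they can live inside a set.
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

assert power_set(A) == {frozenset(), frozenset({1}), frozenset({2}),
                        frozenset({1, 2})}
assert len(power_set(A)) == 2 ** len(A)  # 2^A has 2^n members
```

The last assertion checks the counting remark: with |A| = n, the power set has 2^n members.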
Infinite sets

Our intuition when extended to infinite sets can be misleading. Two sets S_1 and S_2 have the same cardinality (number of members) if there is a one-to-one mapping of the elements of S_1 onto S_2. For finite sets, if S_1 is a proper subset of S_2, then S_1 and S_2 have different cardinality. However, if S_1 and S_2 are infinite, the latter statement may be false. Let S_1 be the set of even integers and let S_2 be the set of all integers. Clearly S_1 is a proper subset of S_2. However, S_1 and S_2 have the same cardinality, since the function f defined by f(2i) = i is a one-to-one mapping of the even integers onto the integers.

Not all infinite sets have the same cardinality. Consider the set of all integers and the set of all reals. Assume that the set of reals can be put in one-to-one-onto correspondence with the integers. Then consider the real number whose ith digit after the decimal is the ith digit of the ith real plus 5, mod 10. This real number differs from every real that has been mapped to an integer, so it cannot be in correspondence with any integer. From this we conclude that the reals cannot be placed in one-to-one correspondence with the integers. Intuitively there are too many real numbers to do so. The above construction is called diagonalization and is an important tool in computer science.

Sets that can be placed in one-to-one correspondence with the integers are said to be countably infinite or countable. The rationals and the set Σ* of the finite-length strings from an alphabet Σ are countably infinite. The set of all subsets of Σ* and the set of all functions mapping the integers to {0, 1} are of the same cardinality as the reals, and are not countable.
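Diagonalization proper applies to an infinite list, but its mechanics can be seen on a finite analogue. The Python sketch below is ours, not the book's: given any finite table whose ith row lists the digits of the ith "real," the text's rule (digit i of row i, plus 5 mod 10) produces a digit sequence that differs from every row.

```python
def diagonal(table):
    # For row i, take digit i and add 5 mod 10, exactly as in the text.
    return [(row[i] + 5) % 10 for i, row in enumerate(table)]

# A small "table of reals": row i holds the digits of the ith real.
table = [
    [1, 4, 1, 5],
    [7, 1, 8, 2],
    [3, 3, 3, 3],
    [0, 0, 0, 0],
]
d = diagonal(table)
assert d == [6, 6, 8, 5]
# d differs from row i in position i, so it equals no row of the table.
assert all(d[i] != row[i] for i, row in enumerate(table))
```

Since adding 5 mod 10 never maps a digit to itself, the constructed sequence disagrees with row i in position i no matter what the table contains.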
1.5 RELATIONS

A (binary) relation is a set of pairs. The first component of each pair is chosen from a set called the domain, and the second component of each pair is chosen from a (possibly different) set called the range. We shall use primarily relations in which the domain and range are the same set S. In that case we say the relation is on S. If R is a relation and (a, b) is a pair in R, then we often write aRb.
Properties of relations

We say a relation R on set S is
1) reflexive if aRa for all a in S;
2) irreflexive if aRa is false for all a in S;
3) transitive if aRb and bRc imply aRc;
4) symmetric if aRb implies bRa;
5) asymmetric if aRb implies that bRa is false.
Note that any asymmetric relation must be irreflexive.

Example 1.3 The relation < on the set of integers is transitive because a < b and b < c imply a < c. It is asymmetric, and hence irreflexive, because a < b implies that b < a is false.
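On a finite set, each of the five properties can be tested by brute-force enumeration. The helper below is our Python sketch, applied to the relation < of Example 1.3 restricted to a small range of integers.

```python
def properties(S, R):
    # R is a set of ordered pairs over S; check each property directly
    # from its definition by enumerating the relevant pairs.
    return {
        "reflexive":   all((a, a) in R for a in S),
        "irreflexive": all((a, a) not in R for a in S),
        "transitive":  all((a, c) in R
                           for (a, b) in R for (b2, c) in R if b == b2),
        "symmetric":   all((b, a) in R for (a, b) in R),
        "asymmetric":  all((b, a) not in R for (a, b) in R),
    }

S = range(5)
less_than = {(a, b) for a in S for b in S if a < b}
p = properties(S, less_than)
# As in Example 1.3: < is transitive and asymmetric, hence irreflexive.
assert p["transitive"] and p["asymmetric"] and p["irreflexive"]
assert not p["reflexive"] and not p["symmetric"]
```

Restricting to a finite subset only illustrates the definitions, of course; it does not prove the properties for all integers.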
Equivalence relations

A relation R that is reflexive, symmetric, and transitive is said to be an equivalence relation. An important property of an equivalence relation R on a set S is that R partitions S into disjoint nonempty equivalence classes (see Exercise 1.8 and its solution). That is, S = S_1 ∪ S_2 ∪ ···, where for each i and j, with i ≠ j:
1) S_i ∩ S_j = ∅;
2) for each a and b in S_i, aRb is true;
3) for each a in S_i and b in S_j, aRb is false.
The S_i's are called equivalence classes. Note that the number of classes may be infinite.
Example 1.4 A common example of an equivalence relation is congruence modulo an integer m. We write i ≡_m j, or i ≡ j mod m, if i and j are integers such that i − j is divisible by m. The reader may easily prove that ≡_m is reflexive, transitive, and symmetric. The equivalence classes of ≡_m are m in number:

    {..., −m, 0, m, 2m, ...},
    {..., −m + 1, 1, m + 1, 2m + 1, ...},
    ...
    {..., −1, m − 1, 2m − 1, 3m − 1, ...}.
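Congruence modulo m can be used to partition any finite range of integers into its m classes; the fragment below is our Python illustration of Example 1.4 with m = 3.

```python
def classes_mod(m, universe):
    # i and j land in the same class exactly when i - j is divisible
    # by m, i.e. when i % m == j % m (Python's % is nonnegative here).
    buckets = {}
    for i in universe:
        buckets.setdefault(i % m, []).append(i)
    return list(buckets.values())

parts = classes_mod(3, range(-10, 11))
assert len(parts) == 3  # exactly m classes
# The classes are disjoint and together cover the whole universe.
assert sorted(x for p in parts for x in p) == list(range(-10, 11))
# Any two members of one class differ by a multiple of 3.
assert all((i - j) % 3 == 0 for p in parts for i in p for j in p)
```

Only a finite slice of each (infinite) class is produced, but the three displayed properties of a partition all hold on that slice.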
Closures of relations

Suppose 𝒫 is a set of properties of relations. The 𝒫-closure of a relation R is the smallest relation R′ that includes all the pairs of R and possesses the properties in 𝒫. For example, the transitive closure of R, denoted R⁺, is defined by:
1) If (a, b) is in R, then (a, b) is in R⁺.
2) If (a, b) is in R⁺ and (b, c) is in R, then (a, c) is in R⁺.
3) Nothing is in R⁺ unless it so follows from (1) and (2).

It should be evident that any pair placed in R⁺ by rules (1) and (2) belongs there, else R⁺ would either not include R or not be transitive. Also an easy inductive proof shows that R⁺ is in fact transitive. Thus R⁺ includes R, is transitive, and contains as few pairs as any relation that includes R and is transitive. The reflexive and transitive closure of R, denoted R*, is easily seen to be

    R⁺ ∪ {(a, a) | a is in S}.

Example 1.5 Let R = {(1, 2), (2, 2), (2, 3)} be a relation on the set {1, 2, 3}. Then

    R⁺ = {(1, 2), (2, 2), (2, 3), (1, 3)},

and

    R* = {(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)}.
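Rules (1) and (2) translate directly into an iteration that keeps adding pairs until nothing changes. The Python sketch below is ours, not the book's; it reproduces Example 1.5.

```python
def transitive_closure(R):
    # Rule (2): whenever (a, b) is in R+ and (b, c) is in R, add (a, c).
    # Iterate until no new pairs appear.
    closure = set(R)
    changed = True
    while changed:
        new = {(a, c) for (a, b) in closure for (b2, c) in R if b == b2}
        changed = not new <= closure
        closure |= new
    return closure

def reflexive_transitive_closure(R, S):
    # R* is R+ together with (a, a) for every a in S.
    return transitive_closure(R) | {(a, a) for a in S}

R = {(1, 2), (2, 2), (2, 3)}
assert transitive_closure(R) == {(1, 2), (2, 2), (2, 3), (1, 3)}
assert reflexive_transitive_closure(R, {1, 2, 3}) == {
    (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)}
```

The loop terminates because the closure can only grow, and it is bounded above by the finite set S × S.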
1.6 SYNOPSIS OF THE BOOK
Computer science is the systematized body of knowledge concerning computation. Its beginnings can be traced back to the design of algorithms by Euclid and the use of asymptotic complexity and reducibility by the Babylonians (Hogben [1955]). Modern interest, however, is shaped by two important events: the advent of modern digital computers capable of many millions of operations per second, and the formalization of the concept of an effective procedure, with the consequence that there are provably noncomputable functions.
Computer science has two major components: first, the fundamental ideas and models underlying computing, and second, engineering techniques for the design of computing systems, both hardware and software, especially the application of theory to design. This book is intended as an introduction to the first area, the fundamental ideas underlying computing, although we shall remark briefly on the most important applications. Theoretical computer science had its beginnings in a number of diverse fields: biologists studying models for neuron nets, electrical engineers developing switching theory as a tool to hardware design, mathematicians working on the foundations of logic, and linguists investigating grammars for natural languages. Out of these studies came models that are central to theoretical computer science.
The notions of finite automata and regular expressions (Chapters 2 and 3) were originally developed with neuron nets and switching circuits in mind. More recently, they have served as useful tools in the design of lexical analyzers, the part of a compiler that groups characters into tokens: indivisible units such as variable names and keywords. A number of compiler-writing systems automatically transform regular expressions into finite automata for use as lexical analyzers. A number of other uses for regular expressions and finite automata have been found in text editors, pattern matching, various text-processing and file-searching programs, and as mathematical concepts with application to other areas, such as logic. At the end of Chapter 2 we shall outline some of the applications of this theory.
The notion of a context-free grammar and the corresponding pushdown automaton (Chapters 4 through 6) has aided immensely in the specification of programming languages and in the design of parsers, another key portion of a compiler. Formal specifications of programming languages have replaced extensive and often incomplete or ambiguous descriptions of languages. Understanding the capabilities of the pushdown automaton has greatly simplified parsing. It is interesting to observe that parser design was, for the earliest compilers, a difficult problem, and many of the early parsers were quite inefficient and unnecessarily restrictive. Now, thanks to widespread knowledge of a variety of context-free-grammar-based techniques, parser design is no longer a problem, and parsing occupies only a few percent of the time spent in typical compilation. In Chapter 10 we sketch the principal ways in which efficient parsers that behave as pushdown automata can be built from certain kinds of context-free grammars.

In Chapter 7 we meet Turing machines and confront one of the fundamental problems of computer science; namely, that there are more functions than there are names for functions or than there are algorithms for computing functions. Thus we are faced with the existence of functions that are simply not computable; that is, there is no computer program that can ever be written which, given an argument for the function, produces the value of the function for that argument and works for all possible arguments. Assume that for each computable function there is a computer program or algorithm that computes it, and assume that any computer program or algorithm can be finitely specified. Thus computer programs are no more than finite-length strings of symbols over some finite alphabet. Hence the set of all computer programs is countably infinite. Consider now functions mapping the integers to 0 and 1. Assume that the set of all such functions is countably infinite and that these functions have been placed in correspondence with the integers. Let f_i be the function corresponding to the ith integer. Then the function

    f(n) = 0 if f_n(n) = 1, and f(n) = 1 otherwise,

cannot correspond to any integer, which is a contradiction. [If f(n) = f_j(n) for all n, then we have the contradiction f(j) = f_j(j) and f(j) ≠ f_j(j).] This argument is formalized in Chapters 7 and 8, where we shall see that certain easily stated problems cannot be solved on the computer, even though they appear at first glance to be amenable to computation.
However, we can do more than tell whether a problem can be solved by a computer. Just because a problem can be solved doesn't mean there is a practical algorithm to solve it. In Chapter 12 we see that there are abstract problems that are solvable by computer but require inordinate amounts of time and/or space for their solution. Then in Chapter 13 we discover that there are many realistic and important problems that also fall in this category. The nascent theory of "intractable problems" is destined to influence profoundly how we think about problems.
EXERCISES

1.1 In the tree of Fig. 1.4,
a) Which vertices are leaves and which are interior vertices?
b) Which vertices are the sons of 5?
c) Which vertex is the father of 5?
d) What is the length of the path from 1 to 9?
e) Which vertex is the root?

Fig. 1.4 A tree.

1.2 Prove by induction on n that
a) \sum_{i=0}^{n} i = \frac{n(n+1)}{2}
b) \sum_{i=0}^{n} i^3 = \left( \sum_{i=0}^{n} i \right)^2
*S 1.3 A palindrome can be defined as a string that reads the same forward and backward, or by the following definition.
1) ε is a palindrome.
2) If a is any symbol, then the string a is a palindrome.
3) If a is any symbol and x is a palindrome, then axa is a palindrome.
4) Nothing is a palindrome unless it follows from (1) through (3).
Prove by induction that the two definitions are equivalent.
* 1.4 The strings of balanced parentheses can be defined in at least two ways.
1) A string w over alphabet {(, )} is balanced if and only if:
   a) w has an equal number of ('s and )'s, and
   b) any prefix of w has at least as many ('s as )'s.
2) a) ε is balanced.
   b) If w is a balanced string, then (w) is balanced.
   c) If w and x are balanced strings, then so is wx.
   d) Nothing else is a balanced string.
Prove by induction on the length of a string that definitions (1) and (2) define the same class of strings.
* 1.5 What is wrong with the following inductive "proof" that all elements in any set must be identical? For sets with one element the statement is trivially true. Assume the statement is true for sets with n − 1 elements, and consider a set S with n elements. Let a be an element of S. Write S = S_1 ∪ S_2, where S_1 and S_2 each have n − 1 elements, and each contains a. By the inductive hypothesis all elements in S_1 are identical to a, and similarly all elements in S_2 are identical to a. Thus all elements in S are identical to a.
1.6
a)
b)
The The
that the following are equivalence relations
and give
their equivalence classes.
Ri on integers defined by iRij if and only if i = j. relation R 2 on people defined by pR 2 q if and only if p and q were born relation
at the
same
hour of the same day of some year. c)
The same
1.7
as (b) but "of the
same year"
instead of "of
some
year."
1.7 Find the transitive closure, the reflexive and transitive closure, and the symmetric closure of the relation {(1, 2), (2, 3), (3, 4), (5, 4)}.

*S 1.8 Prove that any equivalence relation R on a set S partitions S into disjoint equivalence classes.

* 1.9 Give an example of a relation that is symmetric and transitive but not reflexive. [Hint: Note where reflexivity is needed to show that an equivalence relation defines equivalence classes; see the solution to Exercise 1.8.]

* 1.10 Prove that any subset of a countably infinite set is either finite or countably infinite.

* 1.11 Prove that the set of all ordered pairs of integers is countably infinite.

1.12 Is the union of a countably infinite collection of countably infinite sets countably infinite? Is the Cartesian product?
Solutions to Selected Exercises

1.3 Clearly every string satisfying the second definition reads the same forward and backward. Suppose x reads the same forward and backward. We prove by induction on the length of x that x's being a palindrome follows from rules (1) through (3). If |x| ≤ 1, then x is either ε or a single symbol a, and rule (1) or (2) applies. If |x| > 1, then x begins and ends with some symbol a. Thus x = awa, where w reads the same forward and backward and is shorter than x. By the induction hypothesis, rules (1) through (3) imply that w is a palindrome. Thus by rule (3), x = awa is a palindrome.
1.8 Let R be an equivalence relation on S, and suppose a and b are elements of S. Let C_a and C_b be the equivalence classes containing a and b, respectively; that is, C_a = {c | aRc} and C_b = {c | bRc}. We shall show that either C_a = C_b or C_a ∩ C_b = ∅. Suppose C_a ∩ C_b ≠ ∅; let d be in C_a ∩ C_b. Now let e be an arbitrary member of C_a. Thus aRe. As d is in C_a and C_b, we have aRd and bRd. By symmetry, dRa. By transitivity (twice), bRa and bRe. Thus e is in C_b, and hence C_a ⊆ C_b. A similar proof shows that C_b ⊆ C_a, so C_a = C_b. Thus distinct equivalence classes are disjoint.

To show that the classes form a partition, we have only to observe that by reflexivity, each a is in the equivalence class C_a, so the union of the equivalence classes is S.
CHAPTER 2

FINITE AUTOMATA AND REGULAR EXPRESSIONS

2.1 FINITE STATE SYSTEMS
The finite automaton is a mathematical model of a system, with discrete inputs and outputs. The system can be in any one of a finite number of internal configurations or "states." The state of the system summarizes the information concerning past inputs that is needed to determine the behavior of the system on subsequent inputs. The control mechanism of an elevator is a good example of a finite state system. That mechanism does not remember all previous requests for service but only the current floor, the direction of motion (up or down), and the collection of not yet satisfied requests for service.
In computer science we find many examples of finite state systems, and the theory of finite automata is a useful design tool for these systems. A primary example is a switching circuit, such as the control unit of a computer. A switching circuit is composed of a finite number of gates, each of which can be in one of two conditions, usually denoted 0 and 1. These conditions might, in electronic terms, be two different voltage levels at the gate output. The state of a switching network with n gates is thus any one of the 2^n assignments of 0 or 1 to the various gates. Although the voltage on each gate can assume any of an infinite set of values, the electronic circuitry is so designed that only the two voltages corresponding to 0 and 1 are stable, and other voltages will almost instantaneously adjust themselves to one of these voltages. Switching circuits are intentionally designed in this way, so that they can be viewed as finite state systems, thereby separating the logical design of a computer from the electronic implementation.

Certain commonly used programs such as text editors and the lexical analyzers found in most compilers are often designed as finite state systems. For example, a lexical analyzer scans the symbols of a computer program to locate the strings of characters corresponding to identifiers, numerical constants, reserved words, and so on. In this process the lexical analyzer needs to remember only a finite amount of information, such as how long a prefix of a reserved word it has seen since startup. The theory of finite automata is used heavily in the design of efficient string processors of these and other sorts. We mention some of these applications in Section 2.8.
The computer itself can be viewed as a finite state system, although doing so turns out not to be as useful as one would like. Theoretically the state of the central processor, main memory, and auxiliary storage at any time is one of a very large but finite number of states. We are assuming of course that there is some fixed number of disks, drums, tapes, and so on available for use, and that one cannot extend the memory indefinitely. Viewing a computer as a finite state system, however, places an artificial limit on the memory capacity, thereby failing to properly capture the notion of computation. It is not satisfying mathematically or realistically. To capture the real essence of computation we need a potentially infinite memory, even though each computer installation is finite. Infinite
models of computers will be discussed in Chapters 7 and 8.

It is also tempting to view the human brain as a finite state system. The number of brain cells or neurons is limited, probably 2^35 at most. It is conceivable, although there is evidence to the contrary, that the state of each neuron can be described by a small number of bits. If so, then finite state theory applies to the brain. However, the number of states is so large that this approach is unlikely to result in useful observations about the human brain, any more than finite state assumptions help us understand large but finite computer systems.

Perhaps the most important reason for the study of finite state systems is the naturalness of the concept, as indicated by the fact that it arises in many diverse places. This is an indication that we have captured the notion of a fundamental class of systems, a class that is rich in structure and potential application.
An example

Before formally defining finite state systems, let us consider an example. A man with a wolf, goat, and cabbage is on the left bank of a river. There is a boat large enough to carry the man and only one of the other three. The man and his entourage wish to cross to the right bank, and the man can ferry each across, one at a time. However, if the man leaves the wolf and goat unattended on either shore, the wolf will surely eat the goat. Similarly, if the goat and cabbage are left unattended, the goat will eat the cabbage. Is it possible to cross the river without the goat or cabbage being eaten?
The problem is modeled by observing that the pertinent information is the occupants of each bank after a crossing. There are 16 subsets of the man (M), wolf (W), goat (G), and cabbage (C). A state corresponds to the subset that is on the left bank. States are labeled by hyphenated pairs such as MG-WC, where the symbols to the left of the hyphen denote the subset on the left bank; symbols to the right of the hyphen denote the subset on the right bank. Some of the 16 states, such as GC-MW, are fatal and may never be entered by the system.

The "inputs" to the system are the actions the man takes. He may cross alone
(input m), with the wolf (input w), the goat (input g), or the cabbage (input c). The initial state is MWGC-∅ and the final state is ∅-MWGC. The transition diagram is shown in Fig. 2.1.
Fig. 2.1 Transition diagram for the man, wolf, goat, and cabbage problem.
There are two equally short solutions to the problem, as can be seen by searching for paths from the initial state to the final state (which is doubly circled). There are infinitely many different solutions to the problem, all but two involving useless cycles. The finite state system can be viewed as defining an infinite language, the set of all strings that are labels of paths from the start state to the final state.

Before proceeding, we should note that there are at least two important ways in which the above example is atypical of finite state systems. First, there is only one final state; in general there may be many. Second, it happens that for each transition there is a reverse transition on the same symbol, which need not be the case in general. Also, note that the term "final state," although traditional, does not mean that the computation need halt when it is reached. We may continue making transitions, e.g., to state MG-WC in the above example.
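The search for short solutions described above can be carried out mechanically by breadth-first search over the state graph. The sketch below is ours, not the book's: it encodes a state as the set of items on the left bank, generates legal crossings, skips the fatal states, and returns the first (hence shortest) input string reaching ∅-MWGC.

```python
from collections import deque

def moves(state):
    """Yield (input, next_state) pairs; a state is a frozenset of the
    items on the left bank, drawn from M, W, G, C."""
    left = state
    here = left if "M" in left else frozenset("MWGC") - left
    for cargo in ["m", "w", "g", "c"]:
        item = cargo.upper()
        if cargo != "m" and item not in here:
            continue  # the man can only ferry what is on his own bank
        crossing = {"M"} if cargo == "m" else {"M", item}
        new_left = left - crossing if "M" in left else left | crossing
        # A bank without the man must not pair wolf/goat or goat/cabbage.
        unattended = new_left if "M" not in new_left else frozenset("MWGC") - new_left
        if {"W", "G"} <= unattended or {"G", "C"} <= unattended:
            continue  # fatal state, never entered
        yield cargo, frozenset(new_left)

def solve():
    start, goal = frozenset("MWGC"), frozenset()
    queue, paths = deque([start]), {start: ""}
    while queue:
        state = queue.popleft()
        for symbol, nxt in moves(state):
            if nxt not in paths:
                paths[nxt] = paths[state] + symbol
                queue.append(nxt)
    return paths[goal]

print(solve())  # one of the two shortest solutions: "gmwgcmg"
```

The seven-symbol answer corresponds to a path of seven crossings; the other shortest solution interchanges the roles of the wolf and the cabbage.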
2.2 BASIC DEFINITIONS
A finite automaton (FA) consists of a finite set of states and a set of transitions from state to state that occur on input symbols chosen from an alphabet Σ. For each input symbol there is exactly one transition out of each state (possibly back to the state itself). One state, usually denoted q0, is the initial state, in which the automaton starts. Some states are designated as final or accepting states.
A directed graph, called a transition diagram, is associated with an FA as follows. The vertices of the graph correspond to the states of the FA. If there is a transition from state q to state p on input a, then there is an arc labeled a from state q to state p in the transition diagram. The FA accepts a string x if the sequence of transitions corresponding to the symbols of x leads from the start state to an accepting state.

Example 2.1
The transition diagram of an FA is illustrated in Fig. 2.2. The initial state, q0, is indicated by the arrow labeled "start." There is one final state, also q0 in this case, indicated by the double circle. The FA accepts all strings of 0's and 1's in which both the number of 0's and the number of 1's are even. To see this, visualize "control" as traveling from state to state in the diagram. Control starts at q0 and must finish at q0 if the input sequence is to be accepted. Each 0-input causes control to cross the horizontal line a-b, while a 1-input does not. Thus control is at a state above the line a-b if and only if the input seen so far contains an even number of 0's. Similarly, control is at a state to the left of the vertical line c-d if and only if the input contains an even number of 1's. Thus control is at q0 if and only if there are both an even number of 0's and an even number of 1's in the input. Note that the FA uses its state to record only the parity of the number of 0's and the number of 1's, not the actual numbers, which would require an infinite number of states.
Fig. 2.2 The transition diagram of a finite automaton.
We formally denote a finite automaton by a 5-tuple (Q, Σ, δ, q0, F), where Q is a finite set of states, Σ is a finite input alphabet, q0 in Q is the initial state, F ⊆ Q is the set of final states, and δ is the transition function mapping Q × Σ to Q. That is, δ(q, a) is a state for each state q and input symbol a.

We picture an FA as a finite control, which is in some state from Q, reading a sequence of symbols from Σ written on a tape, as shown in Fig. 2.3. In one move the FA in state q and scanning symbol a enters state δ(q, a) and moves its head one symbol to the right. If δ(q, a) is an accepting state, then the FA is deemed to have accepted the string written on its input tape up to, but not including, the position to which the head has just moved. If the head has moved off the right end of the tape, then it accepts the entire tape. Note that as an FA scans a string it may accept many different prefixes.
Fig. 2.3 A finite automaton.

To formally describe the behavior of an FA on a string, we must extend the transition function δ to apply to a state and a string rather than a state and a symbol. We define a function δ̂ from Q × Σ* to Q. The intention is that δ̂(q, w) is the state the FA will be in after reading w starting in state q. Put another way, δ̂(q, w) is the unique state p such that there is a path in the transition diagram from q to p, labeled w. Formally we define

1) δ̂(q, ε) = q, and
2) for all strings w and input symbols a, δ̂(q, wa) = δ(δ̂(q, w), a).

Thus (1) states that without reading an input symbol the FA cannot change state, and (2) tells us how to find the state after reading a nonempty input string wa: find the state p = δ̂(q, w) after reading w, then compute the state δ(p, a).

Since δ̂(q, a) = δ(δ̂(q, ε), a) = δ(q, a) [letting w = ε in rule (2) above], there can be no disagreement between δ and δ̂ on arguments for which both are defined. Thus we shall for convenience use δ instead of δ̂ from here on.
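Rules (1) and (2) translate directly into a recursive program. The following sketch is ours (the dictionary encoding of δ and the two-state example are assumptions, not taken from the text):

```python
def delta_hat(delta, q, w):
    """Extended transition function: rule (1) handles the empty string,
    rule (2) peels the last symbol off a nonempty string."""
    if w == "":                        # (1)  delta_hat(q, eps) = q
        return q
    p = delta_hat(delta, q, w[:-1])    # state after reading w
    return delta[p, w[-1]]             # (2)  then one more step on a

# A two-state FA over {0, 1} that tracks the parity of 1's seen.
delta = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd",  ("odd", "1"): "even"}
print(delta_hat(delta, "even", "1101"))  # -> odd (three 1's read)
```

The recursion mirrors the definition exactly; an iterative loop over the symbols of w computes the same state.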
Convention We shall strive to use the same symbols to mean the same thing throughout the material on finite automata. In particular, unless it is stated otherwise, the reader may assume:

1) Q is a set of states. Symbols q and p, with or without subscripts, are states. q0 is the initial state.
2) Σ is an input alphabet. Symbols a and b, with or without subscripts, and the digits are input symbols.
3) δ is a transition function.
4) F is a set of final states.
5) w, x, y, and z, with or without subscripts, are strings of input symbols.

A string x is said to be accepted by a finite automaton M = (Q, Σ, δ, q0, F) if δ(q0, x) = p for some p in F. The language accepted by M, designated L(M), is the set {x | δ(q0, x) is in F}. A language is a regular set (or just regular) if it is the set accepted by some finite automaton.† The reader should note that when we talk about a language accepted by a finite automaton we are referring to the specific set L(M), not just any set of strings all of which happen to be accepted by M.
Example 2.2 Consider the transition diagram of Fig. 2.2 again. In our formal notation this FA is denoted M = (Q, Σ, δ, q0, F), where Q = {q0, q1, q2, q3}, Σ = {0, 1}, F = {q0}, and δ is shown in Fig. 2.4.

                Inputs
    States      0     1
      q0        q2    q1
      q1        q3    q0
      q2        q0    q3
      q3        q1    q2

Fig. 2.4

Suppose 110101 is input to M. We note that δ(q0, 1) = q1 and δ(q1, 1) = q0. Thus

    δ(q0, 11) = δ(δ(q0, 1), 1) = δ(q1, 1) = q0.

We might remark that thus 11 is in L(M), but we are interested in 110101. We continue by noting that δ(q0, 0) = q2. Thus

    δ(q0, 110) = δ(δ(q0, 11), 0) = δ(q0, 0) = q2.

† The term "regular" comes from "regular expressions," a formalism we shall introduce in Section 2.5, and which defines the same class of languages as the FA's.
Continuing in this fashion, we find that

    δ(q0, 1101) = q3,   δ(q0, 11010) = q1,

and finally

    δ(q0, 110101) = q0.

The entire sequence of states is q0, q1, q0, q2, q3, q1, q0. Thus 110101 is in L(M), which is the set of strings with an even number of 0's and an even number of 1's.
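The computation of Example 2.2 can be checked mechanically. A sketch of ours follows, using the transition table as reconstructed in Fig. 2.4 above (the encoding is an assumption):

```python
# Transition table of Fig. 2.4: a 0-input flips the parity of 0's,
# a 1-input flips the parity of 1's; q0 = both parities even.
delta = {("q0", "0"): "q2", ("q0", "1"): "q1",
         ("q1", "0"): "q3", ("q1", "1"): "q0",
         ("q2", "0"): "q0", ("q2", "1"): "q3",
         ("q3", "0"): "q1", ("q3", "1"): "q2"}

def run(w):
    """Return the full state sequence on input w, starting in q0."""
    states = ["q0"]
    for a in w:
        states.append(delta[states[-1], a])
    return states

print(run("110101"))
# -> ['q0', 'q1', 'q0', 'q2', 'q3', 'q1', 'q0'], so 110101 is accepted.

# Spot-check the claimed language: acceptance iff evenly many 0's and 1's.
for n in range(2 ** 10):
    w = bin(n)[2:]
    accepted = run(w)[-1] == "q0"
    assert accepted == (w.count("0") % 2 == 0 and w.count("1") % 2 == 0)
```

The printed sequence is exactly the sequence of states computed in the example.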
2.3 NONDETERMINISTIC FINITE AUTOMATA
We now introduce the notion of a nondeterministic finite automaton. It will turn out that any set accepted by a nondeterministic finite automaton can also be accepted by a deterministic finite automaton. However, the nondeterministic finite automaton is a useful concept in proving theorems. Also, the concept of nondeterminism plays a central role in both the theory of languages and the theory of computation, and it is useful to understand this notion fully in a very simple context initially. Later we shall meet automata whose deterministic and nondeterministic versions are known to be equivalent, and others for which equivalence is a deep and important open question.

Consider modifying the finite automaton model to allow zero, one, or more transitions from a state on the same input symbol. This new model is called a nondeterministic finite automaton (NFA). A transition diagram for a nondeterministic finite automaton is shown in Fig. 2.5. Observe that there are two edges labeled 0 out of state q0, one going back to state q0 and one going to state q3. The input sequence 01001 is accepted along the path through the states q0, q0, q0, q3, q4, q4, labeled 0, 1, 0, 0, 1. This particular NFA accepts all strings with either two consecutive 0's or two consecutive 1's.

Note that the FA of the previous section (deterministic FA, or DFA for emphasis) is a special case of the NFA in which for each state there is a unique transition on each symbol. Thus in a DFA, for a given input string w and state q, there will be exactly one path labeled w starting at q. To determine if a string is accepted by a DFA, it suffices to check this one path. For an NFA there may be many paths labeled w, and all must be checked to see whether one or more terminate at a final state.
Fig. 2.5 The transition diagram for a nondeterministic finite automaton.

In terms of the picture in Fig. 2.3 with a finite control reading an input tape, we may view the NFA as also reading an input tape. However, the finite control can at any time be in any number of states. When a choice of next state can be made, as in state q0 on input 0 in Fig. 2.5, we may imagine that duplicate copies of the automaton are made. For each possible next state there is one copy of the automaton whose finite control is in that state. This proliferation is exhibited in Fig. 2.6 for the NFA of Fig. 2.5 with input 01001.

Fig. 2.6 Proliferation of states of an NFA.
Formally we denote a nondeterministic finite automaton by a 5-tuple (Q, Σ, δ, q0, F), where Q, Σ, q0, and F (states, inputs, start state, and final states) have the same meaning as for a DFA, but δ is a map from Q × Σ to 2^Q. (Recall 2^Q is the power set of Q, the set of all subsets of Q.) The intention is that δ(q, a) is the set of all states p such that there is a transition labeled a from q to p.
2.3
Example
The
2.3
|
N ON DETERM IN ISTI C FINITE AUTOMATA
function 3 for the
NFA
of Fig. 2.5
given in Fig.
is
21
2.7.
Inputs States \
0
too,
1
<7 3 }
qi
0
{q 2 }
Q2
{q 2 }
{q 2 }
0
The mapping
Fig. 2.7
The
d for the
NFA
function 3 can be extended to a function
of Fig.
2.5.
Q
mapping
<5
x Z* to 2° and
reflecting sequences of inputs as follows:
1) δ̂(q, ε) = {q},
2) δ̂(q, wa) = {p | for some state r in δ̂(q, w), p is in δ(r, a)}.

Condition (1) disallows a change in state without an input. Condition (2) indicates that starting in state q and reading the string w followed by input symbol a, we can be in state p if and only if one possible state we can be in after reading w is r, and from r we may go to p upon reading a.

Note that δ̂(q, a) = δ(q, a) for a an input symbol, so we may again use δ in place of δ̂. It is also useful to extend δ to arguments in 2^Q × Σ* by

3) δ(P, w) = ∪_{q in P} δ(q, w)

for each set of states P ⊆ Q. Finally, we define L(M), the language accepted by the NFA M = (Q, Σ, δ, q0, F), to be {w | δ(q0, w) contains a state in F}.

Example 2.4 Consider again the NFA of Fig. 2.5, whose transition function δ was exhibited in Fig. 2.7. Let the input be 01001. Then δ(q0, 0) = {q0, q3}, and

    δ(q0, 01) = δ(δ(q0, 0), 1) = δ({q0, q3}, 1) = δ(q0, 1) ∪ δ(q3, 1) = {q0, q1}.

Similarly, we compute

    δ(q0, 010) = {q0, q3},   δ(q0, 0100) = {q0, q3, q4},

and

    δ(q0, 01001) = {q0, q1, q4}.
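The subset computation of Example 2.4 can be mirrored directly in code: instead of following one path, we carry along the whole set of states the NFA could be in. The dictionary encoding of Fig. 2.7 below is ours:

```python
# delta for the NFA of Fig. 2.5; missing entries denote the empty set.
delta = {("q0", "0"): {"q0", "q3"}, ("q0", "1"): {"q0", "q1"},
         ("q1", "1"): {"q2"},
         ("q2", "0"): {"q2"},       ("q2", "1"): {"q2"},
         ("q3", "0"): {"q4"},
         ("q4", "0"): {"q4"},       ("q4", "1"): {"q4"}}

def delta_hat(states, w):
    """Rule (2), iterated: the set of states reachable on input w."""
    for a in w:
        states = set().union(*(delta.get((q, a), set()) for q in states))
    return states

print(sorted(delta_hat({"q0"}, "01001")))  # -> ['q0', 'q1', 'q4']
# q4 is a final state, so 01001 is accepted.
```

Each pass through the loop is one application of rule (2) together with the set-of-states extension (3).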
The equivalence of DFA's and NFA's

Since every DFA is an NFA, it is clear that the class of languages accepted by NFA's includes the regular sets (the languages accepted by DFA's). However, it turns out that these are the only sets accepted by NFA's. The proof hinges on showing that DFA's can simulate NFA's; that is, for every NFA we can construct an equivalent DFA (one which accepts the same language). The way a DFA simulates an NFA is to allow the states of the DFA to correspond to sets of states of the NFA. The constructed DFA keeps track in its finite control of all the states the NFA could be in after reading the same input as the DFA has read. The formal construction is embodied in our first theorem.
Theorem 2.1 Let L be a set accepted by a nondeterministic finite automaton. Then there exists a deterministic finite automaton that accepts L.

Proof Let M = (Q, Σ, δ, q0, F) be an NFA accepting L. Define a DFA, M' = (Q', Σ, δ', q0', F'), as follows. The states of M' are all the subsets of the set of states of M. That is, Q' = 2^Q. M' will keep track in its state of all the states M could be in at any given time. F' is the set of all states in Q' containing a final state of M. An element of Q' will be denoted by [q1, q2, ..., qi], where q1, q2, ..., qi are in Q. Observe that [q1, q2, ..., qi] is a single state of the DFA corresponding to a set of states of the NFA. Note that q0' = [q0].

We define

    δ'([q1, q2, ..., qi], a) = [p1, p2, ..., pj]

if and only if

    δ({q1, q2, ..., qi}, a) = {p1, p2, ..., pj}.

That is, δ' applied to an element [q1, q2, ..., qi] of Q' is computed by applying δ to each state of Q represented by [q1, q2, ..., qi]. On applying δ to each of q1, q2, ..., qi and taking the union, we get some new set of states, p1, p2, ..., pj. This new set of states has a representative, [p1, p2, ..., pj], in Q', and that element is the value of δ'([q1, q2, ..., qi], a).

It is easy to show by induction on the length of the input string x that

    δ'(q0', x) = [q1, q2, ..., qi]

if and only if

    δ(q0, x) = {q1, q2, ..., qi}.

Basis The result is trivial for |x| = 0, since q0' = [q0] and x must be ε.

Induction Suppose that the hypothesis is true for inputs of length m or less. Let xa be a string of length m + 1 with a in Σ. Then

    δ'(q0', xa) = δ'(δ'(q0', x), a).
Let
2.3
By
23
the inductive hypothesis,
<%o> if
NONDETERMINISTIC FINITE AUTOMATA
|
and only
=
*)
bi, Pi, ...,pj
if
But by definition of 5\ #{[Pi,P2, if
and only
••.,/>;], fl)
=
[ri,r 2 , ...,rj
if
&({Pu P2..-..Py}.
=
fl)
,
{»
i,r 2 ,...,r
Jk
}.
Thus,
if
and only
if
%o.^) =
r 2 , ...,rk },
which establishes the inductive hypothesis. To complete the proof, we have only to add that S(q 0 x) contains a state of ,
Q
that
in F.
is
Since deterministic and nondeterministic
we
Example <*fao>
0)
2.5
=
Let
M=
{q 0 ,
We
({q 0 ,
q x },
8(q 0
1)
DFA
,
{0, 1}, 5,
=
=
q\\ and 0. Since S(q 0 0) ,
=
it
x)
,
NFA
= 0,
where
5{q l9 1)
d\ [q 0 ], F), accepting
<5'([<7o],
denote the elements of gj, we have
0)
= b0
,
=
L(M) as
We
{^ 0 ,
F exactly when
in
is
L(M').
automata accept the same sets, becomes necessary, but shall
&{q» 0)
{0, 1},
=
q 0 {q t }) be an
fail
can construct a M' (Q, consists of all subsets of {q 0 , q t }. [tfo>
finite
between them unless to both as finite automata.
shall not distinguish
simply refer
S'(q'0 ,
Thus L(M)
{q 0 , q t }.
follows.
Q
by
=
[q0 , q,].
gj.
Likewise, *'(fool 1)
=
0)
foil
Naturally, S'(0, 0)
=
<5'(0, 1)
= 0.
<5'([tfo>
= 0,
and
5%,],
1)
Lastly,
0)
=
[q 0 , qi \
since <5({<7o,
Qil 0)
=
5(g 0 , 0)
u
5{q l9 0)
=
{g 0 ,
tfi}
u
0 = fa0
,
ft},
Q
[q 0 ], [q Y ],
FINITE
24
AUTOMATA AND REGULAR EXPRESSIONS
and v(teo, Qil i)
=
l>o,
qil
since
<%o> The
set
F
9i}> 1)
of final states
In practice,
from the
2.4
it
S(qo, 1) is
DFA
u
1)
often turns out that
only
if
=
{^J
u
{^o, 4i}
=
{4o,
{[gj, [g 0 >
initial state [q 0 ]. It is
states to the
added
=
many
therefore a
states of the
NFA are not accessible
good idea to start with
state [q 0 ]
and add
they are the result of a transition from a previously
state.
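The "add states only as they are reached" idea above is easy to program. The following sketch of ours builds just the accessible part of the subset-construction DFA, representing each DFA state [q1, ..., qi] as a frozenset of NFA states; it is applied to the NFA of Example 2.5, for which every subset happens to be accessible.

```python
def subset_construction(delta, start, finals, alphabet):
    """Theorem 2.1, restricted to states accessible from [q0]."""
    q0 = frozenset([start])
    dfa_delta, seen, todo = {}, {q0}, [q0]
    while todo:
        S = todo.pop()
        for a in alphabet:
            # delta'(S, a) = union of delta(q, a) over q in S.
            T = frozenset().union(*(delta.get((q, a), set()) for q in S))
            dfa_delta[S, a] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_finals = {S for S in seen if S & finals}
    return dfa_delta, q0, dfa_finals

# The NFA of Example 2.5; missing entries denote the empty set.
delta = {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q1"},
         ("q1", "1"): {"q0", "q1"}}
dfa_delta, q0, dfa_finals = subset_construction(delta, "q0", {"q1"}, "01")
print(len(dfa_delta) // 2)  # 4 accessible DFA states, as in the example
```

For NFA's with many states the accessible part is often exponentially smaller than the full power set 2^Q, which is why this lazy version is used in practice.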
2.4 FINITE AUTOMATA WITH ε-MOVES

We may extend our model of the nondeterministic finite automaton to include transitions on the empty input ε. The transition diagram of such an NFA accepting the language consisting of any number (including zero) of 0's followed by any number of 1's followed by any number of 2's is given in Fig. 2.8. As always, we say an NFA accepts a string w if there is some path labeled w from the initial state to a final state. Of course, edges labeled ε may be included in the path, although the ε's do not appear explicitly in w. For example, the word 002 is accepted by the NFA of Fig. 2.8 by the path q0, q0, q0, q1, q2, q2 with arcs labeled 0, 0, ε, ε, 2.

Fig. 2.8 Finite automaton with ε-moves.
Formally, define a nondeterministic finite automaton with ε-moves to be a quintuple (Q, Σ, δ, q0, F) with all components as before, but δ, the transition function, maps Q × (Σ ∪ {ε}) to 2^Q. The intention is that δ(q, a) will consist of all states p such that there is a transition labeled a from q to p, where a is either ε or a symbol in Σ.
Example 2.6 The transition function δ for the NFA with ε-moves of Fig. 2.8 is shown in Fig. 2.9.

                Inputs
    States      0       1       2       ε
      q0        {q0}    ∅       ∅       {q1}
      q1        ∅       {q1}    ∅       {q2}
      q2        ∅       ∅       {q2}    ∅

Fig. 2.9 δ(q, a) for the NFA of Fig. 2.8.

We shall now extend the transition function δ to a function δ̂ that maps Q × Σ* to 2^Q. Our expectation is that δ̂(q, w) will be all states p such that one can go from q to p along a path labeled w, perhaps including edges labeled ε.

In constructing δ̂ it will be important to compute the set of states reachable from a given state q using ε transitions only. This question is equivalent to the question of what vertices can be reached from a given (source) vertex in a directed graph. The source vertex is the vertex for state q in the transition diagram, and the directed graph in question consists of all and only the arcs labeled ε. We use ε-CLOSURE(q) to denote the set of all vertices p such that there is a path from q to p labeled ε.
Example 2.7 In Fig. 2.8, ε-CLOSURE(q0) = {q0, q1, q2}. That is, the path consisting of q0 alone (there are no arcs on the path) is a path from q0 to q0 with all arcs labeled ε.† Path q0, q1 shows that q1 is in ε-CLOSURE(q0), and path q0, q1, q2 shows that q2 is in ε-CLOSURE(q0).

We may naturally let ε-CLOSURE(P) = ∪_{q in P} ε-CLOSURE(q), where P is a set of states. Now we define δ̂ as follows.

1) δ̂(q, ε) = ε-CLOSURE(q).
2) For w in Σ* and a in Σ, δ̂(q, wa) = ε-CLOSURE(P), where P = {p | for some r in δ̂(q, w), p is in δ(r, a)}.

It is convenient to extend δ and δ̂ to sets of states by

3) δ(R, a) = ∪_{q in R} δ(q, a), and
4) δ̂(R, w) = ∪_{q in R} δ̂(q, w)

for sets of states R. Note that in this case δ̂(q, a) is not necessarily equal to δ(q, a), since δ̂(q, a) includes all states reachable from q by paths labeled a (including paths with arcs labeled ε), while δ(q, a) includes only those states reachable from q by arcs labeled a. Similarly, δ̂(q, ε) is not necessarily equal to δ(q, ε). Therefore it is necessary to distinguish δ from δ̂ when we talk about an NFA with ε-transitions. We define L(M), the language accepted by M = (Q, Σ, δ, q0, F), to be {w | δ̂(q0, w) contains a state in F}.

† Remember that a path of length zero has no arcs, and therefore trivially all its arcs are labeled ε.
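As the text observes, ε-CLOSURE is just graph reachability over the arcs labeled ε, so a depth-first search computes it. A sketch (our encoding, with the label "eps" standing for ε):

```python
def eps_closure(delta, states):
    """All states reachable from `states` using eps-arcs only
    (depth-first search over the arcs labeled "eps")."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, "eps"), set()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure

# The NFA with eps-moves of Fig. 2.8.
delta = {("q0", "0"): {"q0"}, ("q0", "eps"): {"q1"},
         ("q1", "1"): {"q1"}, ("q1", "eps"): {"q2"},
         ("q2", "2"): {"q2"}}
print(sorted(eps_closure(delta, {"q0"})))  # -> ['q0', 'q1', 'q2']
```

The printed set agrees with ε-CLOSURE(q0) as computed in Example 2.7; note that every state belongs to its own closure via the path of length zero.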
Example 2.8 Consider again the NFA of Fig. 2.8. Here

    δ̂(q0, ε) = ε-CLOSURE(q0) = {q0, q1, q2}.

Thus

    δ̂(q0, 0) = ε-CLOSURE(δ(δ̂(q0, ε), 0))
             = ε-CLOSURE(δ({q0, q1, q2}, 0))
             = ε-CLOSURE(δ(q0, 0) ∪ δ(q1, 0) ∪ δ(q2, 0))
             = ε-CLOSURE({q0} ∪ ∅ ∪ ∅)
             = ε-CLOSURE({q0}) = {q0, q1, q2}.

Then

    δ̂(q0, 01) = ε-CLOSURE(δ(δ̂(q0, 0), 1))
              = ε-CLOSURE(δ({q0, q1, q2}, 1))
              = ε-CLOSURE({q1}) = {q1, q2}.

Equivalence of NFA's with and without ε-moves
make transitions on £ does not allow the NFA We show this by simulating an NFA with ^transitions
Like nondeterminism, the ability to to accept nonregular sets.
by an
NFA
Theorem an
NFA
without such transitions.
2.2
If
L is
accepted by an
M = (Q, E,
Proof Let
<5,
q 0 F) be an ,
NFA
with ^transitions. Construct
AT =
where
(Q, £, 3\ q Qy F')
Fu
if
{q 0 }
F
and
NFA with ^transitions, then L is accepted by
without ^transitions.
£-CLOSURE(g 0 )
contains a state of F,
otherwise,
q in Q and a in X. Note that M' has no ^transitions. Thus but we must continue to distinguish between S and S.
d'(q, a) is d(q, a) for
we may
use
<5'
We wish to show by induction on |x| that δ'(q0, x) = δ̂(q0, x). However, this statement may not be true for x = ε, since δ'(q0, ε) = {q0}, while δ̂(q0, ε) = ε-CLOSURE(q0). We therefore begin our induction at 1.

Basis |x| = 1. Then x is a symbol a, and δ'(q0, a) = δ̂(q0, a) by definition of δ'.

Induction |x| > 1. Let x = wa for symbol a in Σ. Then

    δ'(q0, wa) = δ'(δ'(q0, w), a).
2.4
By
the inductive hypothesis, d'(q 0 w) ,
that
<5'(P,
a)
=
= S(q 0
,
w). Let S(q 0 ,
P=
P.
-MOVES
27
We must show
,
UP
UP
3(<7 0 ,
w)
%4
q in
4 in
as
=
w)
€
§(q 0 wa). But
V(P,a)= Then
AUTOMATA WITH
FINITE
|
we have qinP
by rule
(2) in the definition
of
5.
Thus
w") = %o>
<5'(4o>
To complete only
if
for
show
,
definition of F".
which
the proof we shall
S(q 0 x) contains a state of F. If x is
mi). if and immediate from the whenever d(q 0 c),
that S'(q 0y x) contains a state of F'
=
£,
this
statement
is
F
= {q 0 }, and q 0 is placed in contains a state (possibly q 0 ) in F. If x ^ £, then x = wa ), If <5(g 0 x) contains a state of F, then surely S'(q 0 x) contains
That
is,
d'(q 0 , e)
,
£-CLOSURE(g 0
some symbol same state in
a.
,
,
than q 0 then 3(g 0 , x) contains a state in F. If S'(q 0 x) contains q 09 and q 0 is not in F, then as 5(q 0 x) = £-CLOSURE(<5(<5(g 0 , vv), a)), the state in £-CLOSURE(g 0 ) and in F the
Conversely,
F'.
if
S'(q 0y x) contains a state in F' other
,
,
,
must be
in S(q 0 , x).
Example 2.9 Let us apply the construction of Theorem 2.2 to the NFA of Fig. 2.8. In Fig. 2.10 we summarize δ̂(q, a); we may also regard Fig. 2.10 as the transition function δ' of the NFA without ε-transitions constructed by Theorem 2.2.

                Inputs
    States      0               1           2
      q0        {q0, q1, q2}    {q1, q2}    {q2}
      q1        ∅               {q1, q2}    {q2}
      q2        ∅               ∅           {q2}

Fig. 2.10 δ̂(q, a) for Fig. 2.8.

The set of final states F' includes q2 because q2 is in F, and also includes q0, because ε-CLOSURE(q0) and F have a state, q2, in common. The transition diagram for M' is shown in Fig. 2.11.

Fig. 2.11 NFA without ε-transitions.
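The construction of Theorem 2.2 is mechanical enough to program. The sketch below is ours; it computes δ'(q, a) = ε-CLOSURE(δ(ε-CLOSURE(q), a)) for every state and symbol, adjusts the final states, and reproduces the rows of Fig. 2.10 for the NFA of Fig. 2.8 (again with "eps" standing for ε).

```python
def eps_closure(delta, states):
    """States reachable from `states` via eps-arcs alone."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, "eps"), set()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure

def remove_eps_moves(delta, states, alphabet, q0, finals):
    """Theorem 2.2: delta'(q, a) = delta_hat(q, a), and q0 joins F'
    when eps-CLOSURE(q0) contains a state of F."""
    new_delta = {}
    for q in states:
        for a in alphabet:
            reach = eps_closure(delta, {q})                      # before a
            step = set().union(*(delta.get((r, a), set()) for r in reach))
            new_delta[q, a] = eps_closure(delta, step)           # after a
    new_finals = set(finals)
    if eps_closure(delta, {q0}) & finals:
        new_finals.add(q0)
    return new_delta, new_finals

delta = {("q0", "0"): {"q0"}, ("q0", "eps"): {"q1"},
         ("q1", "1"): {"q1"}, ("q1", "eps"): {"q2"},
         ("q2", "2"): {"q2"}}
d, F = remove_eps_moves(delta, {"q0", "q1", "q2"}, "012", "q0", {"q2"})
print(sorted(d["q0", "0"]), sorted(F))  # q0 row matches Fig. 2.10; F' = {q0, q2}
```

The computed table agrees with Fig. 2.10, and F' = {q0, q2} as derived in Example 2.9.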
2.5 REGULAR EXPRESSIONS

The languages accepted by finite automata are easily described by simple expressions called regular expressions. In this section we introduce the operations of concatenation and closure on sets of strings, define regular expressions, and prove that the class of languages accepted by finite automata is precisely the class of languages describable by regular expressions.
Let Σ be a finite set of symbols and let L, L1, and L2 be sets of strings from Σ*. The concatenation of L1 and L2, denoted L1L2, is the set {xy | x is in L1 and y is in L2}. That is, the strings in L1L2 are formed by choosing a string in L1 and following it by a string in L2, in all possible combinations. Define L^0 = {ε} and L^i = LL^{i-1} for i ≥ 1. The Kleene closure (or just closure) of L, denoted L*, is the set

    L* = ∪_{i=0}^{∞} L^i,

and the positive closure of L, denoted L^+, is the set

    L^+ = ∪_{i=1}^{∞} L^i.

That is, L* denotes words constructed by concatenating any number of words from L, and L^+ is the same, but the case of zero words, whose "concatenation" is defined to be ε, is excluded. Note that L^+ contains ε if and only if L does.
Example 2.10 Let L1 = {10, 1} and L2 = {011, 11}. Then L1L2 = {10011, 1011, 111}. Also,

    {10, 11}* = {ε, 10, 11, 1010, 1011, 1110, 1111, ...}.

If Σ is an alphabet, then Σ* denotes all strings of symbols in Σ, as previously stated. Note that we are not distinguishing Σ as an alphabet from Σ as a language of strings of length 1.

Let Σ be an alphabet. The regular expressions over Σ and the sets that they denote are defined recursively as follows.
1) ∅ is a regular expression and denotes the empty set.
2) ε is a regular expression and denotes the set {ε}.
3) For each a in Σ, a is a regular expression and denotes the set {a}.†
4) If r and s are regular expressions denoting the languages R and S, respectively, then (r + s), (rs), and (r*) are regular expressions that denote the sets R ∪ S, RS, and R*, respectively.

† To remind the reader when a symbol is part of a regular expression, we shall write it in boldface. However, we view a and a as the same symbol.
2.5
REGULAR EXPRESSIONS
|
29
In writing regular expressions we can omit many parentheses if we assume that * has higher precedence than concatenation or + , and that concatenation has
higher precedence than
may
.
between a regular expression latter.
When no
0) may be written 01* + 0. We When necessary to distinguish
+ For example, ((0(1*)) +
also abbreviate the expression rr* by r +
confusion
is
r
.
and the language denoted by
possible
we
we use
r,
L(r) for the
use r for both the regular expression and
the language denoted by the regular expression.
Example 2.11 The expression 00 is a regular expression representing {00}. The expression (0 + 1)* denotes all strings of 0's and 1's. Thus (0 + 1)*00(0 + 1)* denotes all strings of 0's and 1's with at least two consecutive 0's.

The regular expression (1 + 10)* denotes all strings of 0's and 1's beginning with 1 and not having two consecutive 0's. In proof, it is an easy induction on i that (1 + 10)^i does not have two consecutive 0's.† Furthermore, given any string beginning with 1 and not having two consecutive 0's, one can partition the string into 1's, each with a following 0 if there is one. For example, 1101011 is partitioned 1-10-10-1-1. This partition shows that any such string is in (1 + 10)^i, where i is the number of 1's. The regular expression (0 + ε)(1 + 10)* denotes all strings of 0's and 1's whatsoever that do not have two consecutive 0's.

For some additional examples, (0 + 1)*011 denotes all strings of 0's and 1's ending in 011. Also, 0*1*2* denotes any number of 0's followed by any number of 1's followed by any number of 2's; this is the language of the NFA of Fig. 2.8. The expression 00*11*22* denotes those strings in 0*1*2* with at least one of each symbol. We may use the shorthand 0^+1^+2^+ for 00*11*22*.

Equivalence of finite automata and regular expressions

We now turn to showing that the languages accepted by finite automata are precisely the languages denoted by regular expressions. This equivalence was the motivation for calling finite automaton languages regular sets. Our plan will be to show by induction on the size of (number of operators in) a regular expression that there is an NFA with ε-transitions denoting the same language. Finally, we show that for every DFA there is a regular expression denoting its language. These constructions, together with Theorems 2.1 and 2.2, show that all four language-defining mechanisms discussed in this chapter define the same class of languages, the regular sets. Figure 2.12 shows the constructions we shall perform or have performed, where an arrow from A to B means that for any descriptor of type A a construction yields an equivalent descriptor of type B.

We proceed to prove that for every regular expression there is an equivalent NFA with ε-transitions.

† If r is a regular expression, r^i stands for rr···r (i times).
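Claims like those of Example 2.11 can be spot-checked with a modern regular-expression engine, writing the book's + as | and ε as an empty alternative. The brute-force check below is our own sketch, not part of the text; it confirms that (0 + ε)(1 + 10)* matches exactly the strings of 0's and 1's with no two consecutive 0's, up to a bounded length.

```python
import re

# (0 + eps)(1 + 10)*  --  Python's | plays the role of the book's +,
# and the empty branch in (0|) plays the role of eps.
no_00 = re.compile(r"(0|)(1|10)*")

def all_strings(max_len):
    """Every string over {0, 1} of length at most max_len."""
    yield ""
    for n in range(1, max_len + 1):
        for bits in range(2 ** n):
            yield format(bits, "0%db" % n)

for w in all_strings(12):
    assert (no_00.fullmatch(w) is not None) == ("00" not in w)
print("checked all strings up to length 12")
```

Such finite checks do not prove the equivalence, of course; the inductive argument in the example does. But they are a quick way to catch a wrong regular expression.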
FINITE
30
AUTOMATA AND REGULAR EXPRESSIONS
Constructions of
Fig. 2.12
Theorem
Let
2.3
this chapter.
be a regular expression. Then there
r
NFA
an
exists
with
£-transitions that accepts L(r).
Proof We show by induction on the number of operators in the regular expression r that there is an NFA M with ε-transitions, having one final state and no transitions out of this final state, such that L(M) = L(r).

Basis (Zero operators) The expression r must be ∅, ε, or a for some a in Σ. The NFA's in Fig. 2.13(a), (b), and (c) clearly satisfy the conditions.

Fig. 2.13 Finite automata for basis step of Theorem 2.3. (a) r = ∅. (b) r = ε. (c) r = a.

Induction (One or more operators) Assume that the theorem is true for regular expressions with fewer than i operators, i ≥ 1. Let r have i operators. There are three cases depending on the form of r.

Case 1  r = r1 + r2. Both r1 and r2 must have fewer than i operators. Thus there are NFA's M1 = (Q1, Σ1, δ1, q1, {f1}) and M2 = (Q2, Σ2, δ2, q2, {f2}) with L(M1) = L(r1) and L(M2) = L(r2). Since we may rename the states of an NFA at will, we may assume Q1 and Q2 are disjoint. Let q0 be a new initial state and f0 a new final state. Construct

    M = (Q1 ∪ Q2 ∪ {q0, f0}, Σ1 ∪ Σ2, δ, q0, {f0}),

where δ is defined by

i) δ(q0, ε) = {q1, q2},
ii) δ(q, a) = δ1(q, a) for q in Q1 − {f1} and a in Σ1 ∪ {ε},
iii) δ(q, a) = δ2(q, a) for q in Q2 − {f2} and a in Σ2 ∪ {ε},
iv) δ(f1, ε) = δ(f2, ε) = {f0}.

Recall by the inductive hypothesis that there are no transitions out of f1 in M1 or out of f2 in M2. Thus all the moves of M1 and M2 are present in M.

The construction of M is depicted in Fig. 2.14(a). Any path in the transition diagram of M from q0 to f0 must begin by going to either q1 or q2 on ε. If the path goes to q1, it may follow any path in M1 to f1 and then go to f0 on ε. Similarly, paths that begin by going to q2 may follow any path in M2 to f2 and then go to f0 on ε. These are the only paths from q0 to f0. It follows immediately that there is a path labeled x in M from q0 to f0 if and only if there is a path labeled x in M1 from q1 to f1 or a path labeled x in M2 from q2 to f2. Hence L(M) = L(M1) ∪ L(M2), as desired.
Fig. 2.14 Constructions used in induction of Theorem 2.3. (a) For union. (b) For concatenation. (c) For closure.

Case 2  r = r1 r2. Let M1 and M2 be as in Case 1 and construct

    M = (Q1 ∪ Q2, Σ1 ∪ Σ2, δ, q1, {f2}),

where δ is given by

i) δ(q, a) = δ1(q, a) for q in Q1 − {f1} and a in Σ1 ∪ {ε},
ii) δ(f1, ε) = {q2},
iii) δ(q, a) = δ2(q, a) for q in Q2 and a in Σ2 ∪ {ε}.

The construction of M is given in Fig. 2.14(b). Every path in M from q1 to f2 is a path labeled by some string x from q1 to f1, followed by the edge from f1 to q2 labeled ε, followed by a path labeled by some string y from q2 to f2. Thus L(M) = {xy | x is in L(M1) and y is in L(M2)}, and L(M) = L(M1)L(M2), as desired.
Case 3  r = r1*. Let M1 = (Q1, Σ1, δ1, q1, {f1}) be as above, with L(M1) = L(r1). Construct

    M = (Q1 ∪ {q0, f0}, Σ1, δ, q0, {f0}),

where δ is given by

i) δ(q0, ε) = δ(f1, ε) = {q1, f0},
ii) δ(q, a) = δ1(q, a) for q in Q1 − {f1} and a in Σ1 ∪ {ε}.

The construction of M is depicted in Fig. 2.14(c). Any path from q0 to f0 consists either of a path from q0 to f0 on ε, or of a path from q0 to q1 on ε, followed by some number (possibly zero) of paths from q1 to f1 and back to q1 on ε, each labeled by a string in L(M1), followed by a path from q1 to f1 on a string in L(M1), then to f0 on ε. Thus there is a path in M from q0 to f0 labeled x if and only if we can write x = x1 x2 ··· xj for some j ≥ 0 (the case j = 0 means x = ε) such that each xi is in L(M1). Hence L(M) = L(M1)*, as desired.

Example 2.12 Let us construct an NFA for the regular expression 01* + 1. By our precedence rules, this expression is really (0(1*)) + 1, so it is of the form r1 + r2, where r1 = 01* and r2 = 1. The automaton for r2 is easy: a start state with a single transition on 1 to its final state. We may express r1 as r3 r4, where r3 = 0 and r4 = 1*. The automaton for r3 is also easy: a start state with a single transition on 0 to its final state. In turn, r4 is r5*, where r5 is 1, and an NFA for r5 is again immediate.

Note that the need to keep the states of different automata disjoint prohibits us from using the same NFA for r5 and r2, although they are the same expression. To construct an NFA for r4 = r5*, use the construction of Fig. 2.14(c). Two new states are created to play the roles of q0 and f0 in that construction; the result is shown in Fig. 2.15(a). Then, for r1 = r3 r4, use the construction of Fig. 2.14(b); the resulting NFA is shown in Fig. 2.15(b). Finally, use the construction of Fig. 2.14(a) to find the NFA for r = r1 + r2. States q9 and q10 are created to fill the roles of q0 and f0 in that construction. The result is shown in Fig. 2.15(c).

Fig. 2.15 Constructing an NFA from a regular expression. (a) For r4 = 1*. (b) For r1 = 01*. (c) For r = 01* + 1.
The proof of Theorem 2.3 is in essence an algorithm for converting a regular expression to a finite automaton. However, the algorithm implicitly assumes that the regular expression is fully parenthesized. For regular expressions without redundant parentheses, we must determine whether the expression is of the form p + q, pq, or p*. This is equivalent to parsing a string in a context-free language, and thus such an algorithm will be delayed until Chapter 5, where it can be done more elegantly.

Now we must show that every set accepted by a finite automaton is denoted by some regular expression. This result will complete the circle shown in Fig. 2.12.
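The construction of Theorem 2.3 is mechanical enough to run directly. The following Python sketch is ours, not the authors': `thompson` builds the ε-NFA of Figs. 2.13 and 2.14 from a fully parenthesized expression given as a nested tuple, and `accepts` simulates the result. Here ε is represented by the empty string and the empty set ∅ by `None`.

```python
# Sketch of the Theorem 2.3 construction: each case returns an NFA with one
# start state, one final state, and no transitions out of the final state.
from itertools import count

_fresh = count()  # supply of new state names, so automata stay disjoint

def thompson(r):
    """r is a symbol, "" (epsilon), None (empty set), ("+", r1, r2),
    (".", r1, r2), or ("*", r1).  Returns (start, final, moves), where
    moves maps (state, symbol-or-"") to a set of states."""
    s, f = next(_fresh), next(_fresh)
    if r is None:                       # basis: empty set, no path s to f
        return s, f, {}
    if isinstance(r, str):              # basis: epsilon ("") or one symbol
        return s, f, {(s, r): {f}}
    if r[0] == "+":                     # Fig. 2.14(a): union
        s1, f1, m1 = thompson(r[1]); s2, f2, m2 = thompson(r[2])
        m = {**m1, **m2}
        m[(s, "")] = {s1, s2}
        m[(f1, "")] = m[(f2, "")] = {f}
        return s, f, m
    if r[0] == ".":                     # Fig. 2.14(b): concatenation
        s1, f1, m1 = thompson(r[1]); s2, f2, m2 = thompson(r[2])
        m = {**m1, **m2}
        m[(f1, "")] = {s2}
        return s1, f2, m
    if r[0] == "*":                     # Fig. 2.14(c): closure
        s1, f1, m1 = thompson(r[1])
        m = dict(m1)
        m[(s, "")] = m[(f1, "")] = {s1, f}
        return s, f, m

def accepts(nfa, w):
    start, final, moves = nfa
    def eclose(S):                      # states reachable on epsilon alone
        S = set(S); stack = list(S)
        while stack:
            q = stack.pop()
            for p in moves.get((q, ""), ()):
                if p not in S:
                    S.add(p); stack.append(p)
        return S
    S = eclose({start})
    for a in w:
        S = eclose({p for q in S for p in moves.get((q, a), ())})
    return final in S

# The expression 01* + 1 of Example 2.12:
r = ("+", (".", "0", ("*", "1")), "1")
nfa = thompson(r)
print([w for w in ["1", "0", "011", "11", "010"] if accepts(nfa, w)])
# → ['1', '0', '011']
```

The invariant that no transitions leave a final state is what lets each case splice sub-automata together without interference.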
Theorem 2.4 If L is accepted by a DFA, then L is denoted by a regular expression.

Proof Let L be the set accepted by the DFA M = ({q1, q2, ..., qn}, Σ, δ, q1, F). Let R^k_ij denote the set of all strings x such that δ(qi, x) = qj, and if δ(qi, y) = qℓ for any y that is a prefix (initial segment) of x, other than x or ε, then ℓ ≤ k. That is, R^k_ij is the set of all strings that take the finite automaton from state qi to state qj without going through any state numbered higher than k. Note that by "going through a state," we mean both entering and then leaving. Thus i or j may be greater than k. Since there is no state numbered greater than n, R^n_ij denotes all strings that take M from qi to qj. We may define R^k_ij recursively:

    R^k_ij = R^{k-1}_ik (R^{k-1}_kk)* R^{k-1}_kj ∪ R^{k-1}_ij,        (2.1)

    R^0_ij = {a | δ(qi, a) = qj}          if i ≠ j,
    R^0_ij = {a | δ(qi, a) = qj} ∪ {ε}    if i = j.

Informally, the definition of R^k_ij above means that the inputs that cause M to go from qi to qj without passing through a state higher than qk are either

1) in R^{k-1}_ij (that is, they never pass through a state as high as qk); or

2) composed of a string in R^{k-1}_ik (which takes M to qk for the first time), followed by zero or more strings in R^{k-1}_kk (which take M from qk back to qk without passing through qk or a higher-numbered state), followed by a string in R^{k-1}_kj (which takes M from state qk to qj).

We must show that for each i, j, and k, there exists a regular expression r^k_ij denoting the language R^k_ij. We proceed by induction on k.

Basis (k = 0). R^0_ij is a finite set of strings, each of which is either ε or a single symbol. Thus r^0_ij can be written as a1 + a2 + ··· + ap (or a1 + a2 + ··· + ap + ε if i = j), where {a1, a2, ..., ap} is the set of symbols a such that δ(qi, a) = qj. If there are no such a's, then ∅ (or ε in the case i = j) serves as r^0_ij.

Induction The recursive formula for R^k_ij given in (2.1) clearly involves only the regular expression operators: union, concatenation, and closure. By the induction hypothesis, for each ℓ and m there exists a regular expression r^{k-1}_ℓm such that L(r^{k-1}_ℓm) = R^{k-1}_ℓm. Thus for r^k_ij we may select the regular expression

    r^{k-1}_ik (r^{k-1}_kk)* r^{k-1}_kj + r^{k-1}_ij.

To complete the proof, observe that L(M) is the union of R^n_1j over all qj in F, since R^n_1j denotes the labels of all paths from q1 to qj. Thus L(M) is denoted by the regular expression

    r^n_1j1 + r^n_1j2 + ··· + r^n_1jp,

where F = {qj1, qj2, ..., qjp}.

Example 2.13 Let M be the FA shown in Fig. 2.16. The values of r^k_ij for all i and j and for k = 0, 1, or 2 are tabulated in Fig. 2.17. Certain equivalences among regular expressions, such as (r + s)t = rt + st and (ε + r)* = r*, have been used to simplify the expressions (see Exercise 2.16). For example, strictly speaking, the expression for r^1_22 is given by r^0_21(r^0_11)*r^0_12 + r^0_22 = 0(ε)*0 + ε, which simplifies to ε + 00.
Fig. 2.16 FA for Example 2.13. (The automaton has states q1, q2, q3, with δ(q1, 0) = q2, δ(q1, 1) = q3, δ(q2, 0) = q1, δ(q2, 1) = q3, δ(q3, 0) = q2, δ(q3, 1) = q2; q1 is the start state and F = {q2, q3}.)

              k = 0     k = 1       k = 2
    r^k_11    ε         ε           (00)*
    r^k_12    0         0           0(00)*
    r^k_13    1         1           0*1
    r^k_21    0         0           0(00)*
    r^k_22    ε         ε + 00      (00)*
    r^k_23    1         1 + 01      0*1
    r^k_31    ∅         ∅           (0 + 1)(00)*0
    r^k_32    0 + 1     0 + 1       (0 + 1)(00)*
    r^k_33    ε         ε           ε + (0 + 1)0*1

Fig. 2.17 Tabulation of r^k_ij for the FA of Fig. 2.16.
Similarly,

    r^2_13 = r^1_12 (r^1_22)* r^1_23 + r^1_13 = 0(ε + 00)*(1 + 01) + 1.

Recognizing that (ε + 00)* is equivalent to (00)*, and that 1 + 01 is equivalent to (ε + 0)1, we have

    r^2_13 = 0(00)*(ε + 0)1 + 1.

Observe that (00)*(ε + 0) is equivalent to 0*. Thus 0(00)*(ε + 0)1 + 1 is equivalent to 00*1 + 1 and hence to 0*1.

To complete the construction of the regular expression for M, we write r^3_12 + r^3_13, where

    r^3_12 = r^2_13 (r^2_33)* r^2_32 + r^2_12
           = 0*1(ε + (0 + 1)0*1)*(0 + 1)(00)* + 0(00)*
           = 0*1((0 + 1)0*1)*(0 + 1)(00)* + 0(00)*

and

    r^3_13 = r^2_13 (r^2_33)* r^2_33 + r^2_13
           = 0*1(ε + (0 + 1)0*1)*(ε + (0 + 1)0*1) + 0*1
           = 0*1((0 + 1)0*1)*.

Hence

    r^3_12 + r^3_13 = 0*1((0 + 1)0*1)*(ε + (0 + 1)(00)*) + 0(00)*.
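The recurrence (2.1) translates directly into a short dynamic program. The sketch below is our own code, not the authors': it builds the expressions r^k_ij as Python `re` fragments (with `None` for ∅ and `""` for ε), applies only trivial simplifications, and checks the result against a direct simulation of the FA of Fig. 2.16. The strings it produces are bulkier than the hand-simplified entries of Fig. 2.17, but they denote the same sets.

```python
# Theorem 2.4 as code (our own sketch):
#   R^k_ij = R^{k-1}_ik (R^{k-1}_kk)* R^{k-1}_kj  U  R^{k-1}_ij
import re
from itertools import product

def union(r, s):
    if r is None: return s
    if s is None: return r
    if r == s:    return r
    return f"(?:{r}|{s})"

def concat(r, s):
    if r is None or s is None: return None
    return r + s

def star(r):
    if r is None or r == "": return ""   # (empty set)* = epsilon* = epsilon
    return f"(?:{r})*"

def dfa_to_regex(states, alphabet, delta, start, finals):
    n = len(states)
    R = [[None] * n for _ in range(n)]   # basis: r^0_ij
    for i in range(n):
        for a in alphabet:
            j = states.index(delta[(states[i], a)])
            R[i][j] = union(R[i][j], a)
    for i in range(n):
        R[i][i] = union(R[i][i], "")     # epsilon when i = j
    for k in range(n):                   # induction on k
        R = [[union(concat(concat(R[i][k], star(R[k][k])), R[k][j]), R[i][j])
              for j in range(n)] for i in range(n)]
    out = None
    for f in finals:                     # r^n_1j summed over final states
        out = union(out, R[states.index(start)][states.index(f)])
    return out

# The FA of Fig. 2.16 (transitions read off the r^0 column of Fig. 2.17):
states = ["q1", "q2", "q3"]
delta = {("q1","0"): "q2", ("q1","1"): "q3",
         ("q2","0"): "q1", ("q2","1"): "q3",
         ("q3","0"): "q2", ("q3","1"): "q2"}
r = dfa_to_regex(states, "01", delta, "q1", ["q2", "q3"])

def run(w):                              # direct simulation, for checking
    q = "q1"
    for a in w: q = delta[(q, a)]
    return q in ("q2", "q3")

for n in range(7):
    for w in map("".join, product("01", repeat=n)):
        assert (re.fullmatch(r, w) is not None) == run(w)
print("regular expression agrees with the DFA on all strings up to length 6")
```

The brute-force check at the end is the practical analogue of the induction in the proof: the two descriptors define the same language.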
2.6  TWO-WAY FINITE AUTOMATA
We
have viewed the finite automaton as a control unit that reads a tape, moving one square right at each move. We added nondeterminism to the model, which allowed many "copies" of the control unit to exist and scan the tape simultaneously. Next we added e-transitions, which allowed change of state without reading the input symbol or moving the tape head. Another interesting extension is to allow the tape head the ability to move left as well as right. Such a finite automaton is called a two-way finite automaton. It accepts an input string if it moves the tape head off the right end of the tape, at the same time entering an accepting state. We shall see that even this generalization does not increase the power of the finite automaton; two-way FA accept only regular sets. We give a proof only for a special case of a two-way FA that is deterministic and whose tape head must move left or right (not remain stationary) at each move. A more general model is considered
in the exercises.
A two-way deterministic finite automaton (2DFA) is a quintuple M = (Q, Σ, δ, q0, F), where Q, Σ, q0, and F are as before, and δ is a map from Q × Σ to Q × {L, R}. If δ(q, a) = (p, L), then in state q, scanning input symbol a, the 2DFA enters state p and moves its head left one square. If δ(q, a) = (p, R), the 2DFA enters state p and moves its head right one square.

In describing the behavior of a one-way FA, we extended δ to Q × Σ*. This corresponds to thinking of the FA as receiving a symbol on an input channel, processing the symbol, and requesting the next. This notion is insufficient for the two-way FA, since the 2DFA may move left. Thus the notion of the input being written on the tape is crucial. Instead of trying to extend δ, we introduce the notion of an instantaneous description (ID) of a 2DFA, which describes the input string, the current state, and the current position of the input head. Then we introduce the relation ⊢_M, or just ⊢ if M is understood, on ID's, such that I1 ⊢ I2 if and only if M can go from the instantaneous description I1 to I2 in one move.

An ID of M is a string in Σ*QΣ*. The ID wqx, where w and x are in Σ* and q is in Q, is intended to represent the facts that

1) wx is the input string,
2) q is the current state,
3) the input head is scanning the first symbol of x.

If x = ε, then the input head has moved off the right end of the input.

If the input is a1 a2 ··· an, we define ⊢ by

1) a1 ··· ai-1 q ai ··· an ⊢ a1 ··· ai p ai+1 ··· an whenever δ(q, ai) = (p, R), and
2) a1 ··· ai-1 q ai ··· an ⊢ a1 ··· ai-2 p ai-1 ··· an whenever δ(q, ai) = (p, L) and i > 1.

The condition i > 1 prevents any action in the event that the tape head would move off the left end of the tape. Note that no move is possible if i = n + 1 (the tape head has moved off the right end). Let ⊢* be the reflexive and transitive closure of ⊢. That is, I ⊢* I for all ID's I, and I1 ⊢* Ik whenever I1 ⊢ I2 ⊢ ··· ⊢ Ik for some I2, ..., Ik-1.

We define

    L(M) = {w | q0 w ⊢* wp for some p in F}.

That is, w is accepted by M if, starting in state q0 with w on the input tape and the head at the left end of w, M eventually enters a final state at the same time it falls off the right end of the input tape.
Example 2.14 Consider a 2DFA M that behaves as follows. Starting in state q0, M repeats a cycle of moves wherein the tape head moves right until two 1's have been encountered, then moves left until encountering a 0, at which point state q0 is reentered and the cycle repeated. More precisely, M has three states, all of which are final; δ is given in Fig. 2.18.

              0          1
    q0     (q0, R)    (q1, R)
    q1     (q1, R)    (q2, L)
    q2     (q0, R)    (q2, L)

Fig. 2.18 The transition function for the 2DFA of Example 2.14.
Consider the input 101001. Since q0 is the initial state, the first ID is q0101001. To obtain the second ID, note that the symbol to the immediate right of the state q0 in the first ID is a 1, and δ(q0, 1) is (q1, R). Thus the second ID is 1q101001. Continuing in this fashion we get the result shown in Table 2.1. M eventually moves off the right end of the tape in an accepting state. Thus 101001 is in L(M).

Table 2.1

    q0101001 ⊢ 1q101001 ⊢ 10q11001 ⊢ 1q201001 ⊢ 10q01001 ⊢ 101q1001
             ⊢ 1010q101 ⊢ 10100q11 ⊢ 1010q201 ⊢ 10100q01 ⊢ 101001q1
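The move relation ⊢ is easy to animate. The following simulator is our own sketch, not part of the text; it reproduces the computation of Table 2.1 and cuts off after a fixed number of steps, since a 2DFA can loop.

```python
# A small 2DFA simulator (our own sketch). delta maps (state, symbol) to
# (state, "L") or (state, "R"); acceptance means falling off the right end
# of the tape in a final state.
def run_2dfa(delta, q0, finals, w, limit=10_000):
    q, i = q0, 0                      # i is the head position, 0-based
    ids = [w[:i] + q + w[i:]]         # instantaneous descriptions, as strings
    for _ in range(limit):
        if i == len(w):               # fell off the right end
            return q in finals, ids
        p, d = delta[(q, w[i])]
        q, i = p, i + 1 if d == "R" else i - 1
        if i < 0:                     # would fall off the left end: halt
            return False, ids
        ids.append(w[:i] + q + w[i:])
    return False, ids                 # step limit exceeded (loop): reject

# The 2DFA of Example 2.14 (Fig. 2.18); all three states are final.
delta = {("q0","0"): ("q0","R"), ("q0","1"): ("q1","R"),
         ("q1","0"): ("q1","R"), ("q1","1"): ("q2","L"),
         ("q2","0"): ("q0","R"), ("q2","1"): ("q2","L")}
ok, ids = run_2dfa(delta, "q0", {"q0","q1","q2"}, "101001")
print(ok)                             # → True
print(" |- ".join(ids[:4]))           # q0101001 |- 1q101001 |- 10q11001 |- 1q201001
```

Each entry of `ids` is an ID wqx written as a single string, exactly as in Table 2.1.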
Crossing sequences

A useful picture of the behavior of a 2DFA consists of the input, the path followed by the head, and the state each time the boundary between two tape squares is crossed, with the assumption that the control enters its new state prior to moving the head. For example, the behavior of the 2DFA of Example 2.14 on 101001 is shown in Fig. 2.19. The list of states below each boundary between squares is termed a crossing sequence.

Fig. 2.19 Behavior of the 2DFA of Example 2.14.

Note that if a 2DFA accepts its input, no crossing sequence may have a repeated state with the head moving in the same direction; otherwise the 2DFA, being deterministic, would be in a loop and thus could never fall off the right end.

Another important observation about crossing sequences is that the first time a boundary is crossed, the head must be moving right. Subsequent crossings must be in alternate directions. Thus odd-numbered elements of a crossing sequence represent right moves and even-numbered elements represent left moves. If the input is accepted, it follows that all crossing sequences are of odd length.

A crossing sequence q1, q2, ..., qk is said to be valid if it is of odd length, and no two odd-numbered and no two even-numbered elements are identical. A 2DFA with s states can have valid crossing sequences of length at most 2s, so the number of valid crossing sequences is finite.
Our strategy for showing that any set accepted by a 2DFA M is regular is to construct an equivalent NFA whose states are the valid crossing sequences of M. To construct the transition function of the NFA we first examine the relationship between adjacent crossing sequences. Suppose we are given an isolated tape square holding the symbol a and are also given valid crossing sequences q1, q2, ..., qk and p1, p2, ..., pℓ at the left and right boundaries of the square, respectively. Note that there may be no input strings that could be attached to the left and right of symbol a to actually produce these two crossing sequences. Nevertheless, we can test the two sequences for local compatibility as follows. Whenever the tape head moves left from the square holding a in state qi, restart the automaton on the square holding a in state qi+1. Similarly, whenever the tape head moves right from the square in state pi, restart the automaton on the square in state pi+1. By this method we can test the two crossing sequences to be sure that they are locally consistent. These ideas are made precise below.
We define right-matching and left-matching pairs of crossing sequences recursively in (i) through (v) below. The intention is for q1, q2, ..., qk to right-match p1, p2, ..., pℓ on a if these sequences are consistent, assuming we initially reach a in state q1 moving right, and for the two crossing sequences to left-match if the sequences are consistent, assuming we initially reach a in state p1 moving left. In each case, we take q1, q2, ..., qk to appear at the left boundary of a and p1, p2, ..., pℓ at the right boundary.

i) The null sequence left- and right-matches the null sequence. That is, if we never reach the square holding a, then it is consistent that the boundaries on neither side should be crossed.

ii) If q3, ..., qk right-matches p1, ..., pℓ and δ(q1, a) = (q2, L), then q1, ..., qk right-matches p1, ..., pℓ. That is, if the first crossing of the left boundary is in state q1 and the head immediately moves left in state q2, then if we follow these two crossings by any consistent behavior starting from another crossing of the left boundary moving right, we obtain a consistent pair of sequences with first crossing moving right, i.e., a right-matched pair.

iii) If q2, ..., qk left-matches p2, ..., pℓ and δ(q1, a) = (p1, R), then q1, ..., qk right-matches p1, ..., pℓ. That is, if the first crossing of the left boundary is in state q1 and the head immediately moves right in state p1, then if we follow these two crossings by any consistent behavior starting from a crossing of the right boundary, we obtain a consistent pair of sequences with the first crossing from the left, i.e., a right-matched pair. Note that this case introduces the need for left-matched sequences, even though we are really only interested in right-matched pairs.

iv) If q1, ..., qk left-matches p3, ..., pℓ and δ(p1, a) = (p2, R), then q1, ..., qk left-matches p1, ..., pℓ. The justification is similar to that for rule (ii).

v) If q2, ..., qk right-matches p2, ..., pℓ and δ(p1, a) = (q1, L), then q1, ..., qk left-matches p1, ..., pℓ. The justification is similar to that for rule (iii).

Example 2.15 Consider the 2DFA M of Example 2.14 and a tape square containing the symbol 1. The null sequence left-matches the null sequence, and δ(q0, 1) = (q1, R). Thus q0 right-matches q1 on 1 by rule (iii). Since δ(q1, 1) = (q2, L), the sequence q1, q2, q0 right-matches q1 on 1 by rule (ii). This must be the case, since there is in fact an accepting computation in which this pair of sequences actually occurs to the left and right of a square holding a 1. Note, however, that a pair of sequences could match, yet there could be no computation in which they appeared adjacent, as it could be impossible to find strings to place to the left and right that would "turn the computation around" in the correct states.
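Rules (i) through (v) are a mutual recursion, and they can be transcribed almost verbatim. The sketch below is ours, not the authors' (tuples stand for crossing sequences):

```python
# Rules (i)-(v) transcribed directly (our own sketch). Sequences are tuples
# of states; delta maps (state, symbol) to (state, "L") or (state, "R").
def right_matches(left, right, a, delta):
    if not left and not right:                               # rule (i)
        return True
    if left:
        p, d = delta[(left[0], a)]
        if d == "L" and len(left) >= 2 and p == left[1]:     # rule (ii)
            return right_matches(left[2:], right, a, delta)
        if d == "R" and right and p == right[0]:             # rule (iii)
            return left_matches(left[1:], right[1:], a, delta)
    return False

def left_matches(left, right, a, delta):
    if not left and not right:                               # rule (i)
        return True
    if right:
        p, d = delta[(right[0], a)]
        if d == "R" and len(right) >= 2 and p == right[1]:   # rule (iv)
            return left_matches(left, right[2:], a, delta)
        if d == "L" and left and p == left[0]:               # rule (v)
            return right_matches(left[1:], right[1:], a, delta)
    return False

# The 2DFA of Example 2.14:
delta = {("q0","0"): ("q0","R"), ("q0","1"): ("q1","R"),
         ("q1","0"): ("q1","R"), ("q1","1"): ("q2","L"),
         ("q2","0"): ("q0","R"), ("q2","1"): ("q2","L")}
# Example 2.15: q0 right-matches q1 on 1, and q1,q2,q0 right-matches q1 on 1.
print(right_matches(("q0",), ("q1",), "1", delta))           # → True
print(right_matches(("q1","q2","q0"), ("q1",), "1", delta))  # → True
```

Because δ is deterministic, at most one rule can fire at each step, so the recursion needs no backtracking.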
Equivalence of one-way and two-way finite automata

Theorem 2.5 If L is accepted by a 2DFA, then L is a regular set.

Proof Let M = (Q, Σ, δ, q0, F) be a 2DFA. The proof consists of constructing an NFA M′ which accepts L(M). Define M′ to be (Q′, Σ, δ′, q0′, F′), where

1) Q′ is the set of all valid crossing sequences for M;
2) q0′ is the crossing sequence consisting of q0 alone;
3) F′ is the set of all crossing sequences of length one consisting of a state in F (recall that a valid crossing sequence must be of odd length);
4) δ′(c, a) = {d | d is a valid crossing sequence right-matched by c on input a}.

The intuitive idea is that M′ puts together pieces of the computation of M as it scans the input string. This is done by guessing successive crossing sequences. If M′ has guessed that c is the crossing sequence at a boundary, and a is the next input symbol, then M′ can guess any valid crossing sequence that c right-matches on input a. If the guessed computation results in M moving off the right end of the input in an accepting state, then M′ accepts.
We now show that L(M′) = L(M). Let w be in L(M). Look at the crossing sequences generated by an accepting computation of M on w. Each crossing sequence right-matches the one at the next boundary, so M′ can guess the proper crossing sequences (among other guesses) and accept.

Conversely, let w be in L(M′), and consider the crossing sequences c0, c1, ..., cn of M corresponding to the states of M′ as M′ scans w = a1 a2 ··· an. For each i, 0 ≤ i < n, ci right-matches ci+1 on ai+1. We can construct an accepting computation of M on input w by determining when the head reverses direction. In particular, we prove by induction on i that if ci = [q1, ..., qk], then M started in state q0 on a1 a2 ··· ai will

1) first move right from position i in state q1, and

2) for even j, if M is started at position i in state qj, it will eventually move right from position i in state qj+1 (this implies that k must be odd).

Basis (i = 0). As c0 = [q0], condition (1) is satisfied since M begins its computation by "moving right" from position 0 in state q0. Condition (2) holds vacuously.

Induction Assume the hypothesis true for i − 1, and suppose that M′ on reading a1 a2 ··· ai can enter state ci = [p1, ..., pℓ] from state ci−1 = [q1, ..., qk]. Since ci−1 right-matches ci on ai, there must exist an odd j such that δ(qj, ai) moves right. Let j1 be the smallest such j. By the definition of "right-matches" it follows that δ(qj1, ai) = (p1, R). This proves (1). Also by the definition of "right-matches" (rule iii), [qj1+1, ..., qk] left-matches [p2, ..., pℓ]. If these sequences are null, then (2) follows immediately. In the case that for some smallest even j2, δ(pj2, ai) = (q, L), then by the definition of "left-matches" (rule v), q must be qj1+1, and [qj1+2, ..., qk] right-matches [pj2+1, ..., pℓ]. The argument then repeats with the latter sequences in place of ci−1 and ci.

With the induction hypothesis established, the fact that cn = [p] for some p in F implies that M accepts a1 a2 ··· an.
Example 2.16 Consider the construction of an NFA M′ equivalent to the 2DFA M of Example 2.14. Since q2 is only entered on a left move, and q0 and q1 are only entered on right moves, all even-numbered components of valid crossing sequences must be q2. Since a valid crossing sequence must be of odd length, no two odd-numbered states can be the same, and no two even-numbered states can be the same, there are only four crossing sequences of interest; these are listed in Fig. 2.20 along with their right matches.

    Valid crossing sequences    Right matches on 0      Right matches on 1
    [q0]                        [q0]                    [q1]
    [q1]                        [q1], [q1, q2, q0]      (none)
    [q1, q2, q0]                (none)                  [q1]
    [q0, q2, q1]                (none)                  (none)

Fig. 2.20 Valid crossing sequences along with their right matches.

We note immediately that state [q0, q2, q1] may be removed from the constructed NFA M′, since it has no right match. The resulting M′ is shown in Fig. 2.21. Note that L(M′) = (ε + 1)(0 + 01)*, that is, all strings of 0's and 1's without two consecutive 1's.

Fig. 2.21 The NFA M′ constructed from the 2DFA M.

Consider the input 1001, which is accepted by M′ using the sequence of states [q0], [q1], [q1], [q1, q2, q0], [q1]. We can visualize the crossing sequences as in Fig. 2.22.

Fig. 2.22 Crossing sequences of the 2DFA on input 1001.

Note that δ(q0, 1) = (q1, R) justifies the first move and that δ(q1, 0) = (q1, R) justifies the second and third. Since δ(q1, 1) = (q2, L), we see the justification for the fourth move, which reverses the direction of travel. Then δ(q2, 0) = (q0, R) again reverses the direction, and finally δ(q0, 1) = (q1, R) explains the last move.
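As a check on Example 2.16, we can simulate M′ directly. This is our own sketch; the transition sets are the right matches tabulated in Fig. 2.20, and states are named by their crossing sequences.

```python
# NFA M' of Fig. 2.21 (our own sketch). Its transitions are the right-match
# relation of Fig. 2.20; final states are the length-one sequences whose
# state is final in M (the unreachable [q2] is omitted).
moves = {
    (("q0",), "0"): {("q0",)},
    (("q0",), "1"): {("q1",)},
    (("q1",), "0"): {("q1",), ("q1","q2","q0")},
    (("q1","q2","q0"), "1"): {("q1",)},
}
start, finals = ("q0",), {("q0",), ("q1",)}

def nfa_accepts(w):
    S = {start}                       # current set of crossing-sequence states
    for a in w:
        S = {d for c in S for d in moves.get((c, a), ())}
    return bool(S & finals)

# L(M') should be (e + 1)(0 + 01)*: all strings with no two consecutive 1's.
from itertools import product
for n in range(8):
    for w in map("".join, product("01", repeat=n)):
        assert nfa_accepts(w) == ("11" not in w)
print("M' accepts exactly the strings without consecutive 1's")
```

The exhaustive check over short strings confirms the regular expression claimed for L(M′).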
2.7  FINITE AUTOMATA WITH OUTPUT

One limitation of the finite automaton as we have defined it is that its output is limited to a binary signal: "accept"/"don't accept." Models in which the output is chosen from some other alphabet have been considered. There are two distinct approaches; the output may be associated with the state (called a Moore machine) or with the transition (called a Mealy machine). We shall define each formally and then show that the two machine types produce the same input-output mappings.

Moore machines

A Moore machine is a six-tuple M = (Q, Σ, Δ, δ, λ, q0), where Q, Σ, δ, and q0 are as in the DFA, Δ is the output alphabet, and λ is a mapping from Q to Δ giving the output associated with each state. The output of M in response to input a1 a2 ··· an, n ≥ 0, is λ(q0)λ(q1) ··· λ(qn), where q0, q1, ..., qn is the sequence of states such that δ(qi−1, ai) = qi for 1 ≤ i ≤ n. Note that any Moore machine gives output λ(q0) in response to input ε. The DFA may be viewed as a special case of a Moore machine where the output alphabet is {0, 1} and state q is "accepting" if and only if λ(q) = 1.
Example 2.17 Suppose we wish to determine the residue mod 3 for each binary string treated as a binary integer. To begin, observe that if i written in binary is followed by a 0, the resulting string has value 2i, and if i in binary is followed by a 1, the resulting string has value 2i + 1. If the remainder of i/3 is p, then the remainder of 2i/3 is 2p mod 3. If p = 0, 1, or 2, then 2p mod 3 is 0, 2, or 1, respectively. Similarly, the remainder of (2i + 1)/3 is 1, 0, or 2, respectively. It suffices therefore to design a Moore machine with three states, q0, q1, and q2, where qj is entered if and only if the input seen so far has residue j. We define λ(qj) = j for j = 0, 1, 2. In Fig. 2.23 we show the transition diagram, where the outputs label the states and the transition function δ is designed to reflect the rules regarding calculation of residues described above.

Fig. 2.23 A Moore machine calculating residues.

On input 1010 the sequence of states entered is q0, q1, q2, q2, q1, giving output sequence 01221. That is, ε (taken to have value 0) has residue 0, 1 has residue 1, 10 (2 in decimal) has residue 2, 101 (5 in decimal) has residue 2, and 1010 (10 in decimal) has residue 1.
Mealy machines

A Mealy machine is also a six-tuple M = (Q, Σ, Δ, δ, λ, q0), where all is as in the Moore machine, except that λ maps Q × Σ to Δ. That is, λ(q, a) gives the output associated with the transition from state q on input a. The output of M in response to input a1 a2 ··· an is λ(q0, a1)λ(q1, a2) ··· λ(qn−1, an), where q0, q1, ..., qn is the sequence of states such that δ(qi−1, ai) = qi for 1 ≤ i ≤ n. Note that this output sequence has length n rather than length n + 1 as for the Moore machine, and on input ε a Mealy machine gives output ε.
Example 2.18 Even if the output alphabet has only two symbols, the Mealy machine model can save states when compared with a finite automaton. Consider the language (0 + 1)*(00 + 11) of all strings of 0's and 1's whose last two symbols are the same. In the next chapter we shall develop the tools necessary to show that this language is accepted by no DFA with fewer than five states. However, we may define a three-state Mealy machine that uses its state to remember the last symbol read, emits output y whenever the current input matches the previous one, and emits n otherwise. The sequence of y's and n's emitted by the Mealy machine corresponds to the sequence of accepting and nonaccepting states entered by a DFA on the same input; however, the Mealy machine does not make an output prior to any input, while the DFA rejects the string ε, as its initial state is nonfinal.

The Mealy machine M = ({q0, p0, p1}, {0, 1}, {y, n}, δ, λ, q0) is shown in Fig. 2.24. We use the label a/b on an arc from state p to state q to indicate that δ(p, a) = q and λ(p, a) = b. The response of M to input 01100 is nnyny, with the sequence of states entered being q0 p0 p1 p1 p0 p0. Note how p0 remembers a zero and p1 remembers a one. State q0 is initial and "remembers" that no input has yet been received.
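The three-state machine of Example 2.18 can be sketched as follows (our own code, not the text's):

```python
# Mealy machine of Fig. 2.24 (our own sketch). States: "q0" (nothing read),
# "p0" (last symbol was 0), "p1" (last symbol was 1). Output y when the
# current symbol repeats the previous one, n otherwise.
delta = {("q0","0"): "p0", ("q0","1"): "p1",
         ("p0","0"): "p0", ("p0","1"): "p1",
         ("p1","0"): "p0", ("p1","1"): "p1"}
lam   = {("q0","0"): "n", ("q0","1"): "n",
         ("p0","0"): "y", ("p0","1"): "n",
         ("p1","0"): "n", ("p1","1"): "y"}

def mealy_output(w):
    q, out = "q0", ""
    for a in w:
        out += lam[(q, a)]         # output labels the transition, not the state
        q = delta[(q, a)]
    return out

print(mealy_output("01100"))       # → nnyny
```

A string belongs to (0 + 1)*(00 + 11) exactly when the last symbol emitted is y.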
Fig. 2.24 A Mealy machine.

Equivalence of Moore and Mealy machines
Let M be a Mealy or Moore machine. Define T_M(w), for input string w, to be the output produced by M on input w. There can never be exact identity between the functions T_M and T_M′ if M is a Mealy machine and M′ a Moore machine, because |T_M(w)| is one less than |T_M′(w)| for each w. However, we may neglect the response of a Moore machine to input ε and say that Mealy machine M and Moore machine M′ are equivalent if for all inputs w, bT_M(w) = T_M′(w), where b is the output of M′ for its initial state. We may then prove the following theorems, equating the Mealy and Moore models.

Theorem 2.6 If M1 = (Q, Σ, Δ, δ, λ, q0) is a Moore machine, then there is a Mealy machine M2 equivalent to M1.

Proof Let M2 = (Q, Σ, Δ, δ, λ′, q0), and define λ′(q, a) to be λ(δ(q, a)) for all states q and input symbols a. Then M1 and M2 enter the same sequence of states on the same input, and with each transition M2 emits the output that M1 associates with the state entered.

Theorem 2.7 Let M1 = (Q, Σ, Δ, δ, λ, q0) be a Mealy machine. Then there is a Moore machine M2 equivalent to M1.

Proof Let M2 = (Q × Δ, Σ, Δ, δ′, λ′, [q0, b0]), where b0 is an arbitrarily selected member of Δ. That is, the states of M2 are pairs [q, b] consisting of a state of M1 and an output symbol. Define δ′([q, b], a) = [δ(q, a), λ(q, a)] and λ′([q, b]) = b. The second component of a state [q, b] of M2 is the output made by M1 on some transition into state q. Only the first components of M2's states determine the moves made by M2. An easy induction on n shows that if M1 enters states q0, q1, ..., qn on input a1 a2 ··· an and emits outputs b1, b2, ..., bn, then M2 enters states [q0, b0], [q1, b1], ..., [qn, bn] and emits outputs b0, b1, b2, ..., bn.

Example 2.19 Let M1 be the Mealy machine of Fig. 2.24. The states of M2 are [q0, y], [q0, n], [p0, y], [p0, n], [p1, y], and [p1, n]. Choose b0 = n, making [q0, n] M2's start state. The transitions and outputs of M2 are shown in Fig. 2.25. Note that state [q0, y] can never be entered and may be removed.

Fig. 2.25 Moore machine constructed from Mealy machine.
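The construction of Theorem 2.7 can be written generically. The sketch below is our own code; applied to the Mealy machine of Example 2.18 it reproduces the behavior of the Moore machine of Fig. 2.25.

```python
# Theorem 2.7 (our own sketch): from a Mealy machine (delta, lam, q0) build
# an equivalent Moore machine whose states are pairs (q, b).
def mealy_to_moore(delta, lam, q0, b0):
    d2, l2 = {}, {}
    outputs = set(lam.values()) | {b0}
    states = {(q, b) for (q, _a) in delta for b in outputs}
    for (q, b) in states:
        l2[(q, b)] = b                              # lambda'([q, b]) = b
        for a in {a for (_q, a) in delta}:
            d2[((q, b), a)] = (delta[(q, a)], lam[(q, a)])
    return d2, l2, (q0, b0)

def moore_response(d2, l2, start, w):
    q, out = start, l2[start]                       # output for epsilon too
    for a in w:
        q = d2[(q, a)]
        out += l2[q]
    return out

# Mealy machine of Example 2.18:
delta = {("q0","0"): "p0", ("q0","1"): "p1",
         ("p0","0"): "p0", ("p0","1"): "p1",
         ("p1","0"): "p0", ("p1","1"): "p1"}
lam   = {("q0","0"): "n", ("q0","1"): "n",
         ("p0","0"): "y", ("p0","1"): "n",
         ("p1","0"): "n", ("p1","1"): "y"}
d2, l2, start = mealy_to_moore(delta, lam, "q0", "n")
print(moore_response(d2, l2, start, "01100"))       # → nnnyny
```

The Moore output is b0 followed by the Mealy output, which is exactly the equivalence bT_M1(w) = T_M2(w) defined above.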
2.8  APPLICATIONS OF FINITE AUTOMATA
There are a variety of software design problems that are simplified by automatic conversion of regular expression notation to an efficient computer implementation of the corresponding finite automaton. We mention two such applications here; the bibliographic notes contain references to
some other
applications.
Lexical analyzers

The tokens of a programming language are almost without exception expressible as regular sets. For example, ALGOL identifiers, which are upper- or lower-case letters followed by any sequence of letters and digits, with no limit on length, may be expressed as

    (letter)(letter + digit)*

where "letter" stands for A + B + ··· + Z + a + b + ··· + z, and "digit" stands for 0 + 1 + ··· + 9. FORTRAN identifiers, with length limit six and letters restricted to upper case and the symbol $, may be expressed as

    (letter)(ε + letter + digit)^5

where "letter" now stands for ($ + A + B + ··· + Z). SNOBOL arithmetic constants (which do not permit the exponential notation present in many other languages) may be expressed as

    (ε + −)(digit digit*(. digit* + ε) + . digit digit*)

A number of lexical-analyzer generators take as input a sequence of regular expressions describing the tokens and produce a single finite automaton recognizing any token. Usually, they convert the regular expression to an NFA with ε-transitions and then construct subsets of states to produce a DFA directly, rather than first eliminating ε-transitions. Each final state indicates the particular token found, so the automaton is really a Moore machine. The transition function of the FA is encoded in one of several ways to take less space than the transition table would take if represented as a two-dimensional array. The lexical analyzer produced by the generator is a fixed program that interprets coded tables, together with the particular table that represents the FA recognizing the tokens (specified to the generator in regular expression notation). This lexical analyzer may then be used as a module in a compiler. Examples of lexical-analyzer generators that follow the above approach are found in Johnson et al. [1968] and Lesk [1975].
Text editors

Certain text editors and similar programs permit the substitution of a string for any string matching a given regular expression. For example, the UNIX text editor allows a command such as

    s/bbb*/b/

(where b here denotes a blank) that substitutes a single blank for the first string of two or more blanks found in a given line. Let "any" denote the expression a₁ + a₂ + ··· + aₙ, where the aᵢ's are all of a computer's characters except the "newline" character. We could convert a regular expression r to a DFA that accepts any*r. Note that the presence of any* allows us to recognize a member of L(r) beginning anywhere in the line. However, the conversion of a regular expression to a DFA takes far more time than it takes to scan a single short line using the DFA, and the DFA could have a number of states that is an exponential function of the length of the regular expression.

What actually happens in the UNIX text editor is that the regular expression any*r is converted to an NFA with ε-transitions, and the NFA is then simulated directly, as suggested in Fig. 2.6. However, once a column has been constructed listing all the states the NFA can enter on a particular prefix of the input, the previous column is no longer needed and is thrown away to save space. This approach to regular set recognition was first expressed in Thompson [1968].
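The column-by-column simulation just described can be sketched directly: only the current set of reachable NFA states is kept, and each input character produces the next column. The small NFA below, for a pattern of the form any*(ab) over a three-letter alphabet, is hand-built for illustration.

```python
# A sketch of direct NFA simulation: keep only the current column (the set
# of states reachable on the prefix read so far). The NFA below, for the
# pattern any*(ab) over {a, b, c}, is an invented example.

def eps_closure(states, eps):
    """All states reachable from `states` by ε-transitions alone."""
    stack, seen = list(states), set(states)
    while stack:
        for t in eps.get(stack.pop(), ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def nfa_match(delta, eps, start, finals, line):
    """True if some prefix of `line` drives the NFA into a final state."""
    column = eps_closure({start}, eps)
    if column & finals:
        return True
    for c in line:
        moved = {t for s in column for t in delta.get((s, c), ())}
        column = eps_closure(moved, eps)   # previous column is discarded
        if column & finals:
            return True
    return False

# State 0 loops on every character (any*); 0 -a-> 1 -b-> 2 (final).
delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (0, "c"): {0}, (1, "b"): {2}}
print(nfa_match(delta, {}, 0, {2}, "cccabc"))  # True: "ab" occurs in the line
print(nfa_match(delta, {}, 0, {2}, "accca"))   # False
```

The space used is proportional to the number of NFA states, not to the length of the line, which is exactly the economy the editor exploits.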
EXERCISES

*S 2.1 Find a finite automaton whose behavior corresponds to the circuit in Fig. 2.26, in the sense that final states correspond to a 1-output. A circle with a dot represents an AND-gate, whose output is 1 only if both inputs have value 1. A circle with a + represents an OR-gate, whose output is 1 whenever either input has value 1. A circle with a ~ represents an inverter, whose output is 1 for input 0 and 0 for input 1. Assume there is sufficient time between changes in input values for signals to propagate and for the network to reach a stable configuration.

2.2 Historically, finite automata were first used to model neuron nets. Find a finite automaton whose behavior is equivalent to the neuron net in Fig. 2.27. Final states of the automaton correspond to a 1-output of the network. Each neuron has excitatory (circles) and inhibitory (dots) synapses. A neuron produces a 1-output if the number of excitatory synapses with 1-inputs exceeds the number of inhibitory synapses with 1-inputs by at least the threshold of the neuron (the number inside the triangle). Assume there is sufficient time between changes in input value for signals to propagate and for the network to reach a stable configuration. Further assume that initially the values of y₁, y₂, and y₃ are all 0.

2.3 Consider the toy shown in Fig. 2.28. A marble is dropped in at A or B. Levers x₁, x₂, and x₃ cause the marble to fall either to the left or to the right. Whenever a marble encounters a lever, it causes the lever to change state, so that the next marble to encounter the lever will take the opposite branch.

Fig. 2.28 A toy.

a) Model this toy by a finite automaton. Denote a marble in at A by a 0-input and a marble in at B by a 1-input. A sequence of inputs is accepted if the last marble comes out at D.
b) Describe the set accepted by the finite automaton.
c) Model the toy as a Mealy machine whose output is the sequence of C's and D's out of which successive marbles fall.
2.4 Suppose δ is the transition function of a DFA. Prove that for any input strings x and y, δ(q, xy) = δ(δ(q, x), y). [Hint: Use induction on |y|.]

2.5 Give deterministic finite automata accepting the following languages over the alphabet {0, 1}.
a) The set of all strings ending in 00.
b) The set of all strings with three consecutive 0's.
c) The set of all strings such that every block of five consecutive symbols contains at least two 0's.
d) The set of all strings beginning with a 1 which, interpreted as the binary representation of an integer, is congruent to zero modulo 5.
e) The set of all strings such that the 10th symbol from the right end is 1.

* 2.6 Describe in English the sets accepted by the finite automata whose transition diagrams are given in Fig. 2.29(a) through (c).

S 2.7 Prove that the FA whose transition diagram is given in Fig. 2.30 accepts the set of all strings over the alphabet {0, 1} with an equal number of 0's and 1's, such that each prefix has at most one more 0 than 1's and at most one more 1 than 0's.

2.8 Give nondeterministic finite automata accepting the following languages.
a) The set of strings in (0 + 1)* such that some two 0's are separated by a string whose length is 4i, for some i ≥ 0.
b) The set of all strings over the alphabet {a, b, c} that have the same value when evaluated left to right as right to left, by multiplying according to the table in Fig. 2.31.

Fig. 2.31 Nonassociative multiplication table.

c) The set of all strings of 0's and 1's such that the 10th symbol from the right end is a 1. How does your answer compare with the DFA of Problem 2.5(e)?
2.9 Construct DFA's equivalent to the NFA's
a) ({p, q, r, s}, {0, 1}, δ₁, p, {s}),
b) ({p, q, r, s}, {0, 1}, δ₂, p, {q, s}),
where δ₁ and δ₂ are given in Fig. 2.32.

    δ₁       0      1            δ₂       0      1
    p        p, q   p            p        q, s   q
    q        r      r            q        r      q, r
    r        s      ∅            r        s      p
    s        s      s            s        ∅      p

Fig. 2.32 Two transition functions.
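Exercise 2.9 calls for the subset construction used in Chapter 2 to convert an NFA to a DFA. A sketch follows; the small NFA used here (second-to-last symbol is 1) is an invented example, not necessarily the one in Fig. 2.32.

```python
# A sketch of the subset construction: the DFA's states are sets of NFA
# states, and only sets reachable from the start set are generated.

def subset_construction(delta, start, finals):
    """delta: dict (state, symbol) -> set of NFA states."""
    alphabet = {a for (_, a) in delta}
    start_set = frozenset([start])
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        S = todo.pop()
        for a in alphabet:
            T = frozenset(t for s in S for t in delta.get((s, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_finals = {S for S in seen if S & finals}
    return dfa_delta, start_set, dfa_finals

# Invented NFA: accepts strings over {0,1} whose second-to-last symbol is 1.
nfa = {("p", "0"): {"p"}, ("p", "1"): {"p", "q"},
       ("q", "0"): {"r"}, ("q", "1"): {"r"}}
delta, start, finals = subset_construction(nfa, "p", {"r"})
print(len({S for (S, _) in delta}))  # 4 reachable DFA states
```

Only four of the 2³ = 8 possible subsets are reachable here; generating states lazily in this way is why the exponential blowup of the construction often does not occur in practice.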
2.10 Write regular expressions for each of the following languages over the alphabet {0, 1}. Provide justification that your regular expression is correct.
* a) The set of all strings with at most one pair of consecutive 0's and at most one pair of consecutive 1's.
b) The set of all strings in which every pair of adjacent 0's appears before any pair of adjacent 1's.
c) The set of all strings not containing 101 as a substring.
* d) The set of all strings with an equal number of 0's and 1's such that no prefix has two more 0's than 1's nor two more 1's than 0's.

2.11 Describe in English the sets denoted by the following regular expressions.
a) (11 + 0)*(00 + 1)*
b) (1 + 01 + 001)*(ε + 0 + 00)
c) [00 + 11 + (01 + 10)(00 + 11)*(01 + 10)]*
2.12 Construct finite automata equivalent to the following regular expressions.
a) 10 + (0 + 11)0*1
b) 01[((10)* + 111)* + 0]*1
c) ((0 + 1)(0 + 1))* + ((0 + 1)(0 + 1)(0 + 1))*

2.13 Construct regular expressions corresponding to the state diagrams given in Fig. 2.33.

2.14 Use the ideas in the proof of Theorem 2.4 to construct algorithms for the following problems.
a) Find the lowest-cost path between two vertices in a directed graph, where each edge is labeled with a nonnegative cost.
b) Determine the number of strings of length n accepted by an FA.

2.15 Construct an NFA equivalent to the 2DFA ({q₀, q₁, ..., q₅}, {0, 1}, δ, q₀, {q₂}), where δ is given by Fig. 2.34.
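One way to approach Exercise 2.14(b): the number of strings of length n accepted by a DFA equals the number of length-n paths from the start state to a final state in its transition diagram, which can be counted by dynamic programming over path counts. The DFA below (strings over {0, 1} ending in 00) is an invented example.

```python
# Sketch for Exercise 2.14(b): count[q] holds the number of strings of the
# current length that take the start state to q; one pass per input symbol.

def count_accepted(delta, n_states, start, finals, alphabet, n):
    count = [0] * n_states
    count[start] = 1
    for _ in range(n):
        new = [0] * n_states
        for q in range(n_states):
            for a in alphabet:
                new[delta[(q, a)]] += count[q]
        count = new
    return sum(count[f] for f in finals)

# Invented DFA: states 0 = no trailing 0, 1 = one trailing 0, 2 = ends in 00.
delta = {(0, "0"): 1, (0, "1"): 0, (1, "0"): 2, (1, "1"): 0,
         (2, "0"): 2, (2, "1"): 0}
print(count_accepted(delta, 3, 0, {2}, "01", 3))  # prints 2: 000 and 100
```

Each pass costs time proportional to the number of transitions, so the whole count takes time linear in n, in contrast to enumerating all 2ⁿ strings.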
Fig. 2.34 Transition function for a 2DFA.

2.16 Prove the following identities for regular expressions r, s, and t. Here r = s means L(r) = L(s).
a) r + s = s + r
b) (r + s) + t = r + (s + t)
c) (rs)t = r(st)
d) r(s + t) = rs + rt
e) (r + s)t = rt + st
f) ∅* = ε
g) (r*)* = r*
h) (ε + r)* = r*
i) (r*s*)* = (r + s)*

2.17 Prove or disprove the following for regular expressions r, s, and t.
a) (rs + r)*r = r(sr + r)*
b) s(rs + s)*r = rr*s(rr*s)*
c) (r + s)* = r* + s*

2.18 A two-way nondeterministic finite automaton (2NFA) is defined in the same manner as the 2DFA, except that the 2NFA has a set of possible moves for each state and input symbol. Prove that the set accepted by any 2NFA is regular. [Hint: The observation in the proof of Theorem 2.5, that no state may repeat with the same direction in a valid crossing sequence, is no longer true. However, for each accepted input we may consider a shortest computation leading to acceptance.]

2.19 Show that adding the capability of the 2NFA to keep its head stationary (and change state) on a move does not increase the class of languages accepted by 2NFA's.

** 2.20 A 2NFA with endmarkers is a 2NFA with special symbols ¢ and $ marking the left and right ends of the input. We say that input x, which contains no ¢ or $ symbols, is accepted if the 2NFA, started with ¢x$ on its tape and with the tape head scanning ¢, enters an accepting state anywhere on its tape. Show that the 2NFA with endmarkers accepts only regular sets.
2.21 Consider a 2DFA M = (Q, Σ, δ, q₀, F). For each string x construct a mapping f from Q to Q ∪ {⊥}, where f(q) = p if the 2DFA, started in state q on the rightmost symbol of x, eventually moves off x to the right, in state p. f(q) = ⊥ means that the 2DFA when started on the rightmost symbol of x either never leaves x or moves off the left end. Construct a DFA which simulates M by storing in its finite control a table f instead of a crossing sequence.

** 2.22 Let r and s be regular expressions. Consider the equation X = rX + s, where rX denotes the concatenation of r and X, and + denotes union. Under the assumption that the set denoted by r does not contain ε, find the solution for X and prove that it is unique. What is the solution if L(r) contains ε?

** 2.23 One can construct a regular expression from a finite automaton by solving a set of linear equations of the form

    Xᵢ = aᵢ₁X₁ + aᵢ₂X₂ + ··· + aᵢₙXₙ + bᵢ,    1 ≤ i ≤ n,

where the aᵢⱼ and bᵢ are sets of strings denoted by regular expressions, + denotes set union, and multiplication denotes concatenation. Give an algorithm for solving such equations.

2.24 Give Mealy and Moore machines for the following processes:
a) For input from (0 + 1)*, if the input ends in 101, output A; if the input ends in 110, output B; otherwise output C.
b) For input from (0 + 1 + 2)*, print the residue modulo 5 of the input treated as a ternary (base 3, with digits 0, 1, and 2) number.

Solutions to Sample Exercises
2.1 Note that the gate output at y₁ affects the gate output at y₂ and conversely. We shall assume values for y₁ and y₂ and use these assumed values to compute new values. Then we repeat the process with the new values until we reach a stable state of the system. In Fig. 2.35 we have tabulated the stable values of y₁ and y₂ for each pair of assumed values for y₁ and y₂ and for input values 0 and 1.

            Input                        Input
    y₁y₂    0     1                      0     1
    00      00    01               q₀    q₀    q₁
    01      01    11               q₁    q₁    q₂
    11      11    10               q₂    q₂    q₃
    10      10    00               q₃    q₃    q₀
        (a)                            (b)

Fig. 2.35 Transitions of switching circuit.

If y₁ and y₂ are both assumed to have value 0, then gates A and B have output 0 and gate C has output equal to the value of the input x. Since both inputs to gate D are 0, the output of gate D is 0. The output of gate E has the value of the input x. Thus the top row in Fig. 2.35(a) has entries 00 and 01. The remaining entries are computed in a similar manner.

We can model the circuit by assigning a state to each pair of values for y₁y₂. This is done in Fig. 2.35(b). Since y₁ = y₂ = 1 produces a 1-output, q₂ is a final state. The circuit can be seen to record the parity of pulses (1-inputs) and produce an output pulse for every odd-numbered input pulse.
2.7 We are asked to prove that a set informally described in English is the set accepted by the automaton. Clearly we cannot give a completely formal proof. We must either argue intuitively that some formal description of the set is equivalent to the English description and then proceed formally, or else simply give an informal proof. We choose the latter. The proof consists of deducing the properties of strings taking the automaton to each of the four states, and then proving by induction on the length of a string that our interpretation is correct.

We say that a string x is proper if each prefix of x has at most one more 0 than 1's and at most one more 1 than 0's. We argue by induction on the length of a string x that

1) δ(q₀, x) = q₀ if and only if x is proper and contains an equal number of 0's and 1's,
2) δ(q₀, x) = q₁ if and only if x is proper and contains one more 0 than 1's,
3) δ(q₀, x) = q₂ if and only if x is proper and contains one more 1 than 0's,
4) δ(q₀, x) = q₃ if and only if x is not proper.

Observe that the induction hypothesis is stronger than the desired theorem. Conditions (2), (3), and (4) are added to allow the induction to go through. We prove the "if" portions of (1) through (4) first. The basis of the induction, |x| = 0, follows since the empty string has an equal number of 0's and 1's and δ(q₀, ε) = q₀. Assume the induction hypothesis is true for all x, |x| < n, n ≥ 1. Consider a string y of length n, such that y is proper and has an equal number of 0's and 1's.

First consider the case that y ends in 0. Then y = x0, where x is proper and has one more 1 than 0's. Thus δ(q₀, x) = q₂. Hence δ(q₀, x0) = δ(q₂, 0) = q₀. The case where y ends in 1 is handled similarly.

Next consider a string y, |y| = n, such that y is proper and has one more 0 than 1's. If y = x1, then x has two more 0's than 1's, contradicting the fact that y is proper. Thus y = x0, where x is proper and has an equal number of 0's and 1's. By the induction hypothesis, δ(q₀, x) = q₀; hence δ(q₀, y) = q₁. The situation where y is proper and has one more 1 than 0's, and the situation where y is not proper, are treated similarly.

We must now show that strings reaching each state have the interpretations given in (1) through (4). Suppose that δ(q₀, y) = q₀ and |y| ≥ 1. If y = x0, then δ(q₀, x) = q₂, since q₂ is the only state with a 0-transition to state q₀. Thus by the induction hypothesis x is proper and has one more 1 than 0's. Thus y is proper and has an equal number of 0's and 1's. The case where y ends in 1 is similar, as are the cases δ(q₀, y) = q₁, q₂, or q₃.
BIBLIOGRAPHIC NOTES

The original formal study of finite state systems (neural nets similar to that appearing in Exercise 2.2) is by McCulloch and Pitts [1943]. Kleene [1956] considered regular expressions and modeled the neural nets of McCulloch and Pitts by finite automata, proving the equivalence of the two concepts. Similar models were considered about that time by Huffman [1954], Moore [1956], and Mealy [1955], the latter two being the sources for the terms "Moore machine" and "Mealy machine." Nondeterministic finite automata were introduced by Rabin and Scott [1959], who proved their equivalence to deterministic automata. The notion of a two-way finite automaton and its equivalence to the one-way variety was the independent work of Rabin and Scott [1959] and Shepherdson [1959]. The proof of the equivalence of regular expressions and finite automata as presented here (via NFA's with ε-transitions) is patterned after McNaughton and Yamada [1960]. Brzozowski [1962, 1964] developed the theory of regular expressions. The fact that the unique solution to X = rX + s (Exercise 2.22) is r*s, if L(r) does not contain ε, is known as Arden's [1960] lemma. Floyd [1967] applies the idea of nondeterminism to programs. Salomaa [1966] gives axiomatizations of regular expressions.

Applications of finite automata to switching circuit design can be found in Kohavi [1970] and Friedman [1975]. The use of the theory to design lexical analyzers is treated by Johnson et al. [1968] and Lesk [1975]. Other uses of finite automata theory to design text editors and other text processing programs are discussed in Thompson [1968], Bullen and Millen [1972], Aho and Corasick [1975], Knuth, Morris, and Pratt [1977], and Aho and Ullman [1977]. Some additional works treating finite automata are by Arbib [1970], Conway [1971], Minsky [1967], Moore [1964], and Shannon and McCarthy [1956].
CHAPTER 3

PROPERTIES OF REGULAR SETS
There are several questions one can ask concerning regular sets. One important question is: given a language L specified in some manner, is L a regular set? We also might want to know whether the regular sets denoted by different regular expressions are the same, or find the finite automaton with fewest states that denotes the same language as a given FA. In this chapter we provide tools to deal with questions such as these regarding regular sets. We prove a "pumping lemma" to show that certain languages are nonregular. We provide "closure properties" of regular sets; the fact that languages constructed from regular sets in certain specified ways must also be regular can be used to prove or disprove that certain other languages are regular. The issue of regularity or nonregularity can also be resolved sometimes with the aid of the Myhill-Nerode Theorem of Section 3.4. In addition, we give algorithms to answer a number of other questions about regular expressions and finite automata such as whether a given FA accepts an infinite language.
3.1 THE PUMPING LEMMA FOR REGULAR SETS

In this section we prove a basic result, called the pumping lemma, which is a powerful tool for proving certain languages nonregular. It is also useful in the development of algorithms to answer certain questions concerning finite automata, such as whether the language accepted by a given FA is finite or infinite.

If a language is regular, it is accepted by a DFA M = (Q, Σ, δ, q₀, F) with some particular number of states, say n. Consider an input of n or more symbols a₁a₂···aₘ, m ≥ n, and for i = 1, 2, ..., m let δ(q₀, a₁a₂···aᵢ) = qᵢ. It is not possible for each of the n + 1 states q₀, q₁, ..., qₙ to be distinct, since there are only n different states. Thus there are two integers j and k, 0 ≤ j < k ≤ n, such that qⱼ = qₖ. The path labeled a₁a₂···aₘ in the transition diagram of M is illustrated in Fig. 3.1. Since j < k, the string aⱼ₊₁···aₖ is of length at least 1, and since k ≤ n, its length is no more than n.

Fig. 3.1 Path in transition diagram of DFA M.

If qₘ is in F, that is, a₁a₂···aₘ is in L(M), then a₁a₂···aⱼaₖ₊₁aₖ₊₂···aₘ is also in L(M), since there is a path from q₀ to qₘ that goes through qⱼ but not around the loop labeled aⱼ₊₁···aₖ. Formally, by Exercise 2.4,

    δ(q₀, a₁···aⱼaₖ₊₁···aₘ) = δ(δ(q₀, a₁···aⱼ), aₖ₊₁···aₘ) = δ(qⱼ, aₖ₊₁···aₘ) = qₘ.

Similarly, we could go around the loop of Fig. 3.1 more than once; in fact, as many times as we like. Thus a₁···aⱼ(aⱼ₊₁···aₖ)ⁱaₖ₊₁···aₘ is in L(M) for any i ≥ 0. What we have proved is that given any sufficiently long string accepted by an FA, we can find a substring near the beginning of the string that may be "pumped," i.e., repeated as many times as we like, and the resulting string will be accepted by the FA. The formal statement of the pumping lemma follows.
Lemma 3.1 (The pumping lemma) Let L be a regular set. Then there is a constant n such that if z is any word in L, and |z| ≥ n, we may write z = uvw in such a way that |uv| ≤ n, |v| ≥ 1, and for all i ≥ 0, uvⁱw is in L. Furthermore, n is no greater than the number of states of the smallest FA accepting L.

Proof See the discussion preceding the statement of the lemma. There, z = a₁a₂···aₘ, u = a₁a₂···aⱼ, v = aⱼ₊₁···aₖ, and w = aₖ₊₁···aₘ.

Note that the pumping lemma states that if a regular set contains a long string z, then it contains an infinite set of strings of the form uvⁱw. The lemma does not state that every sufficiently long string in a regular set is of the form uvⁱw for some large i. In fact, (0 + 1)* contains arbitrarily long strings in which no substring appears three times consecutively. (The proof is left as an exercise.)
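The proof of Lemma 3.1 is constructive: run the DFA on z, find the first repeated state among q₀, q₁, ..., qₙ, and split z at the two positions where that state occurs. A sketch of this procedure follows; the DFA used (strings over {a} of length divisible by 3) is an invented example.

```python
# A sketch of the construction behind the pumping lemma: find states
# q_j = q_k on the run of the DFA over z, and return u = z[:j], v = z[j:k],
# w = z[k:]. Pumping v any number of times preserves acceptance.

def pump_split(delta, start, z):
    seen = {start: 0}          # state -> position at which it first occurred
    q = start
    for pos, c in enumerate(z, start=1):
        q = delta[(q, c)]
        if q in seen:          # states q_j and q_k coincide
            j, k = seen[q], pos
            return z[:j], z[j:k], z[k:]
        seen[q] = pos
    return None                # |z| was shorter than the number of states

# Invented DFA: accepts a-strings whose length is divisible by 3.
delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 0}
u, v, w = pump_split(delta, 0, "aaaaaa")
print((u, v, w))               # ('', 'aaa', 'aaa')
```

Here u·vⁱ·w has length 3 + 3i, so every pumped string is still accepted, exactly as the lemma guarantees; note also |uv| = 3, which does not exceed the number of states.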
Applications of the pumping lemma

The pumping lemma is extremely useful in proving that certain sets are not regular. The general methodology in its application is an "adversary argument" of the following form.

1) Select the language L you wish to prove nonregular.
2) The "adversary" picks n, the constant mentioned in the pumping lemma. You must be prepared in what follows for any finite integer n to be picked, but once the adversary has picked n, he may not change it.
3) Select a string z in L. Your choice may depend implicitly on the value of n chosen in (2).
4) The adversary breaks z into u, v, and w, subject to the constraints that |uv| ≤ n and |v| ≥ 1.
5) You achieve a contradiction to the pumping lemma by showing, for any u, v, and w determined by the adversary, that there exists an i for which uvⁱw is not in L. It may then be concluded that L is not regular. Your selection of i may depend on n, u, v, and w.

It is interesting to note that your choices in the above "game" correspond to the universal quantifiers (∀, or "for all") and the "adversary's" choices correspond to the existential quantifiers (∃, or "there exists") in the formal statement of the pumping lemma:

    (∀L)(∃n)(∀z)[z in L and |z| ≥ n implies that
        (∃u, v, w)(z = uvw, |uv| ≤ n, |v| ≥ 1, and (∀i)(uvⁱw is in L))].
Example 3.1 The set L = {0^(i²) | i is an integer, i ≥ 1}, which consists of all strings of 0's whose length is a perfect square, is not regular. Assume L is regular and let n be the integer in the pumping lemma. Let z = 0^(n²). By the pumping lemma, 0^(n²) may be written as uvw, where 1 ≤ |v| ≤ n and uvⁱw is in L for all i. In particular, let i = 2. However, n² < |uv²w| ≤ n² + n < (n + 1)². That is, the length of uv²w lies properly between n² and (n + 1)², and is thus not a perfect square. Thus uv²w is not in L, a contradiction. We conclude that L is not regular.

Example 3.2 Let L be the set of strings of 0's and 1's, beginning with a 1, whose value treated as a binary number is a prime. We shall make use of the pumping lemma to prove that L is not regular. We need two results from number theory. The first is that the number of primes is infinite, and that there are therefore arbitrarily large primes. The second, due to Fermat, is that 2^(p-1) ≡ 1 mod p for any prime p > 2. Stated another way, 2^(p-1) - 1 is divisible by p (see Hardy and Wright [1938]).
Suppose L were regular, and let n be the integer in the pumping lemma. Let z be the binary representation of a prime p such that p > 2ⁿ. Such a prime exists since there are infinitely many primes. By the pumping lemma we may write z = uvw, where |uv| ≤ n, |v| ≥ 1, and uvⁱw is the binary representation of a prime for all i. Let n_u, n_v, and n_w be the values of u, v, and w treated as binary numbers. If u or w is ε, then n_u or n_w, respectively, is 0. Choose i = p. Then uv^p w is the binary representation of a prime q. The numerical value of q is

    q = n_u·2^(p|v|+|w|) + n_v·2^|w|·(1 + 2^|v| + 2^(2|v|) + ··· + 2^((p-1)|v|)) + n_w.

Let s = 1 + 2^|v| + ··· + 2^((p-1)|v|). By Fermat's theorem, 2^(p-1) ≡ 1 mod p. If we raise both sides to the power |v|, we get 2^((p-1)|v|) ≡ 1 mod p, and hence 2^(p|v|) ≡ 2^|v| mod p. Now

    (2^|v| - 1)·s = 2^(p|v|) - 1 ≡ 2^|v| - 1 mod p,

so (2^|v| - 1)(s - 1) is divisible by p. But 1 ≤ |v| ≤ n, so 2 ≤ 2^|v| ≤ 2ⁿ < p. Therefore p cannot divide 2^|v| - 1, so it divides s - 1; that is, s ≡ 1 mod p. Then

    q = n_u·2^(p|v|+|w|) + n_v·2^|w|·s + n_w,

so

    q ≡ n_u·2^(|v|+|w|) + n_v·2^|w| + n_w mod p.    (3.1)

But the right-hand side of (3.1) is the numerical value of uvw, which is p. Thus q ≡ p ≡ 0 mod p, which is to say q is divisible by p. Since q > p > 1, q cannot be prime. But by the pumping lemma, the binary representation of q is in L, a contradiction. We conclude that L is not regular.
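The congruence manipulations in Example 3.2 can be checked numerically. The sketch below uses p = 19 and one particular split of its binary representation; both choices are arbitrary, and the check only requires that p does not divide 2^|v| - 1, as in the argument above.

```python
# A numeric illustration of Example 3.2: for a prime p and a split of its
# binary representation z = u v w with v nonempty, the value of u v^p w is
# divisible by p (when p does not divide 2**len(v) - 1), hence not prime.

p = 19
z = bin(p)[2:]                  # '10011'
u, v, w = z[0], z[1:3], z[3:]   # one arbitrary decomposition, |v| >= 1
q = int(u + v * p + w, 2)       # pump v exactly p times
print(q % p)                    # prints 0: q is divisible by p

assert (2 ** (p - 1)) % p == 1  # Fermat's little theorem for p = 19
```

Since q is larger than p yet divisible by p, it cannot be prime, which is exactly the contradiction the example derives.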
3.2 CLOSURE PROPERTIES OF REGULAR SETS

There are many operations on languages that preserve regular sets, in the sense that the operations applied to regular sets result in regular sets. For example, the union of two regular sets is a regular set, since if r₁ and r₂ are regular expressions denoting the regular sets L₁ and L₂, then r₁ + r₂ denotes L₁ ∪ L₂, so L₁ ∪ L₂ is also regular. Similarly, the concatenation of regular sets is a regular set, and the Kleene closure of a regular set is regular.

If a class of languages is closed under a particular operation, we call that fact a closure property of the class of languages. We are particularly interested in effective closure properties where, given descriptors for languages in the class, there is an algorithm to construct a representation for the language that results by applying the operation to these languages. For example, we just gave an algorithm to construct a regular expression for the union of two languages denoted by regular expressions, so the class of regular sets is effectively closed under union. Closure properties given in this book are effective unless otherwise stated.

It should be observed that the equivalences shown in Chapter 2 between the various models of finite automata and regular expressions were effective equivalences, in the sense that algorithms were given to translate from one representation to another. Thus in proving effective closure properties we may choose the representation that suits us best, usually regular expressions or deterministic finite automata. We now consider a sequence of closure properties of regular sets; additional closure properties are given in the exercises.
Theorem 3.1 The regular sets are closed under union, concatenation, and Kleene closure.

Proof Immediate from the definition of regular expressions.
Boolean operations

Theorem 3.2 The class of regular sets is closed under complementation. That is, if L is a regular set and L ⊆ Σ*, then Σ* - L is a regular set.

Proof Let L be L(M) for DFA M = (Q, Σ₁, δ, q₀, F), and let L ⊆ Σ*. First, we may assume Σ₁ = Σ, for if there are symbols in Σ₁ not in Σ, we may delete all transitions of M on symbols not in Σ. The fact that L ⊆ Σ* assures us that we shall not thereby change the language of M. If there are symbols in Σ not in Σ₁, then none of these symbols appear in words of L. We may therefore introduce a "dead state" d into M with δ(d, a) = d for all a in Σ and δ(q, a) = d for all q in Q and a in Σ - Σ₁.

Now, to accept Σ* - L, complement the final states of M. That is, let M′ = (Q, Σ, δ, q₀, Q - F). Then M′ accepts a word w if and only if δ(q₀, w) is in Q - F, that is, w is in Σ* - L. Note that it is essential to the proof that M is deterministic and without ε-moves.
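The two steps of the proof, completing the automaton with a dead state and then swapping final and non-final states, can be sketched directly. The DFA used below (strings over {0, 1} containing the substring 01) is an invented example.

```python
# A sketch of the construction in Theorem 3.2: add a dead state d for any
# missing transitions, then complement the set of final states.

def complement_dfa(states, alphabet, delta, start, finals):
    """Return a DFA for the complement; delta may be partial."""
    dead = object()                          # fresh dead state d
    full = dict(delta)
    total_states = set(states) | {dead}
    for q in total_states:
        for a in alphabet:
            full.setdefault((q, a), dead)    # missing moves go to d
    new_finals = total_states - set(finals)  # complement the final states
    return total_states, full, start, new_finals

def accepts(delta, start, finals, w):
    q = start
    for c in w:
        q = delta[(q, c)]
    return q in finals

# Invented DFA for "contains 01": 0 = no 0 yet, 1 = saw 0, 2 = saw 01.
delta = {(0, "0"): 1, (0, "1"): 0, (1, "0"): 1, (1, "1"): 2,
         (2, "0"): 2, (2, "1"): 2}
Q, d2, s, F2 = complement_dfa({0, 1, 2}, "01", delta, 0, {2})
print(accepts(d2, s, F2, "1100"))  # True: "1100" does not contain 01
print(accepts(d2, s, F2, "001"))   # False: "001" contains 01
```

The construction would fail on a nondeterministic or ε-move machine, mirroring the remark at the end of the proof: with several runs on the same word, swapping final states does not swap acceptance.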
Theorem 3.3 The regular sets are closed under intersection.

Proof L₁ ∩ L₂ is the complement of the union of the complements of L₁ and L₂, where complements are taken with respect to an alphabet including the alphabets of L₁ and L₂. Closure under intersection then follows from closure under union and complementation.

It is worth noting that a direct construction of a DFA for the intersection of two regular sets exists. The construction involves taking the Cartesian product of states, and we sketch the construction as follows. Let M₁ = (Q₁, Σ, δ₁, q₁, F₁) and M₂ = (Q₂, Σ, δ₂, q₂, F₂) be two deterministic finite automata. Let

    M = (Q₁ × Q₂, Σ, δ, [q₁, q₂], F₁ × F₂),

where δ([p₁, p₂], a) = [δ₁(p₁, a), δ₂(p₂, a)] for all p₁ in Q₁, p₂ in Q₂, and a in Σ. It is easily shown that L(M) = L(M₁) ∩ L(M₂).
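The Cartesian product construction can be sketched as follows: the intersection DFA runs both machines in parallel on pairs of states, generating only reachable pairs. The two DFAs below (even number of a's; last symbol is b) are invented examples.

```python
# A sketch of the product construction of Theorem 3.3: states are pairs
# [p1, p2], and a pair is final when both components are final.

def product_dfa(d1, s1, f1, d2, s2, f2, alphabet):
    delta = {}
    start = (s1, s2)
    todo, seen = [start], {start}
    while todo:
        p1, p2 = todo.pop()
        for a in alphabet:
            t = (d1[(p1, a)], d2[(p2, a)])
            delta[((p1, p2), a)] = t
            if t not in seen:
                seen.add(t)
                todo.append(t)
    finals = {(p1, p2) for (p1, p2) in seen if p1 in f1 and p2 in f2}
    return delta, start, finals

def run(delta, start, finals, w):
    q = start
    for c in w:
        q = delta[(q, c)]
    return q in finals

# M1: even number of a's.  M2: last symbol is b.  (Both invented.)
d1 = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}
d2 = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}
delta, start, finals = product_dfa(d1, 0, {0}, d2, 0, {1}, "ab")
print(run(delta, start, finals, "aab"))  # True: two a's and ends in b
print(run(delta, start, finals, "ab"))   # False: odd number of a's
```

Choosing finals as F₁ × Q₂ ∪ Q₁ × F₂ instead would give a direct product construction for union; only the choice of final pairs changes.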
Substitutions and homomorphisms

The class of regular sets has the interesting property that it is closed under substitution, in the following sense. For each symbol a in the alphabet of some regular set R, let R_a be a particular regular set. Suppose that we replace each word a₁a₂···aₙ in R by the set of words of the form w₁w₂···wₙ, where wᵢ is an arbitrary word in R_{aᵢ}. Then the result is always a regular set. More formally, a substitution f is a mapping of an alphabet Σ onto subsets of Δ*, for some alphabet Δ. Thus f associates a language with each symbol of Σ. The mapping f is extended to strings as follows:

1) f(ε) = ε,
2) f(xa) = f(x)f(a).

The mapping f is extended to languages by defining

    f(L) = ∪ f(x), the union taken over all x in L.

Example 3.3 Let f(0) = a and f(1) = b*. That is, f(0) is the language {a} and f(1) is the language of all strings of b's. Then f(010) is the regular set ab*a. If L is the language 0*(0 + 1)1*, then f(L) is a*(a + b*)(b*)*.

Theorem 3.4 The class of regular sets is closed under substitution.

Proof Let R ⊆ Σ* be a regular set, and for each a in Σ let R_a ⊆ Δ* be a regular set. Let f be the substitution defined by f(a) = R_a. Select regular expressions denoting R and each R_a. Replace each occurrence of the symbol a in the regular expression for R by the regular expression for R_a. To prove that the resulting regular expression denotes f(R), observe that the substitution of a union, product, or closure is the union, product, or closure of the substitution. [Thus, for example, f(L₁ ∪ L₂) = f(L₁) ∪ f(L₂).] A simple induction on the number of operators in the regular expression completes the proof.
{
Note 0*(1
+
that in
Example
3.3
we computed f(L) by taking L's 1. The fact
0)1* and substituting a for 0 and b* for
regular expression
is
regular expression that the resulting
equivalent to the simpler regular expression a*b*
is
a
coincidence.
A
type of substitution that
homomorphism h
is
is
of special interest
is
the
homomorphism. A
a substitution such that h(a) contains a single string for each
a.
3.2
We
CLOSURE PROPERTIES OF REGULAR SETS
|
generally take h(a) to be the string
We
rather than the set containing that
itself,
homomorphic image of a language L to be
string. It is useful to define the inverse
= {x\h(x) is
h~ l (L)
61
inL}.
also use, for string w;
h~ 1 (w)
Example
Let h(0)
3.4
(01)*, then ^(Lj)
the string
1.
To
is
=
=
{x\h(x)
aa and h(\)
(aaaba)*. Let
L2 =
=
(ab
=
aba.
+
Then /i(010) = aaabaaa. If Lj is Then h~ (L 2 ) consists only of
x of O's and nonempty and w is
l's,
since h(0)
1
ba)*a.
see this, observe that a string in
h(x) for any string
w}.
L2
that begins with b cannot be
and /i(l)each begin with an
Thus
a.
if
h~ 1 (w) is in L 2 then w begins with a. Now either w = a, in which case h~ (w) is surely empty, or w is abW for some w' in (ab + ba)*a. We conclude that every word in h~ 1 (w) begins with a 1, and since h(\) = aba, must _1 begin with a. If w' = a, we have w = aba and /z (w) = {1}. However, if ± a, = abW and hence w = ababW But no string x in (0 + 1)* has h(x) beginthen ning abab. Consequently we conclude that h~ 1 (w) is empty in this case. Thus the ,
l
W W
W
.
only string in
r'(L2 ) =
L 2 which
has an inverse image under h
is
aba,
and therefore
{i}.
Observe that h(h 1 (L2 )) = {aba} j= L 2 On the other hand, c L and h~ l (h(L)) 3 L for any language L. .
it is
shown
easily
l
that h(h~ (L))
Theorem 3.5 The class of regular sets is closed under homomorphisms and inverse homomorphisms.

Proof Closure under homomorphisms follows immediately from closure under substitution, since every homomorphism is a substitution in which h(a) has one member.

To show closure under inverse homomorphism, let M = (Q, Σ, δ, q₀, F) be a DFA accepting L, and let h be a homomorphism from Δ to Σ*. We construct a DFA M′ that accepts h⁻¹(L) by reading a symbol a in Δ and simulating M on h(a). Formally, let M′ = (Q, Δ, δ′, q₀, F) and define δ′(q, a), for q in Q and a in Δ, to be δ(q, h(a)). Note that h(a) may be a long string, or ε, but δ is defined on all strings by extension. It is easy to show by induction on |x| that δ′(q₀, x) = δ(q₀, h(x)). Therefore M′ accepts x if and only if M accepts h(x). That is, L(M′) = h⁻¹(L(M)).
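The inverse-homomorphism construction is short enough to sketch in full: δ′(q, a) is computed by running the original DFA from q over the string h(a). The DFA (even number of 1's) and the homomorphism h below are invented examples.

```python
# A sketch of the construction in Theorem 3.5: delta'(q, a) = delta(q, h(a)),
# where delta is first extended from symbols to strings.

def delta_hat(delta, q, w):
    """Extend delta to strings."""
    for c in w:
        q = delta[(q, c)]
    return q

def inverse_hom_dfa(delta, states, h):
    """Build delta' over the domain alphabet of h; states and finals carry over."""
    return {(q, a): delta_hat(delta, q, h[a]) for q in states for a in h}

# Invented M: even number of 1's over {0,1}.  h: a -> 01, b -> 11, c -> ε.
delta = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}
h = {"a": "01", "b": "11", "c": ""}
dprime = inverse_hom_dfa(delta, {0, 1}, h)

# x = "abc": h(x) = "0111" has three 1's, so M rejects h(x) and M' rejects x.
print(delta_hat(dprime, 0, "abc") in {0})  # False
```

Note that h("c") = ε is handled with no special case: running delta_hat over the empty string simply leaves the state unchanged, which is the "δ is defined on all strings by extension" remark in the proof.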
Example 3.5 The importance of homomorphisms and inverse homomorphisms comes in simplifying proofs. We know, for example, that {0ⁿ1ⁿ | n ≥ 1} is not regular. Intuitively, {aⁿbaⁿ | n ≥ 1} is not regular for the same reasons. That is, if we had an FA M accepting {aⁿbaⁿ | n ≥ 1}, we could accept {0ⁿ1ⁿ | n ≥ 1} by simulating M on input a for each 0. When the first 1 is seen, simulate M on ba, and thereafter simulate M on a for each 1 seen. However, to be rigorous it is necessary to formally prove that {aⁿbaⁿ | n ≥ 1} is not regular. This is done by showing that {aⁿbaⁿ | n ≥ 1} can be converted to {0ⁿ1ⁿ | n ≥ 1} by use of operations that preserve regularity. Thus {aⁿbaⁿ | n ≥ 1} cannot be regular.

Let h₁ and h₂ be the homomorphisms

    h₁(a) = a,    h₁(b) = ba,    h₁(c) = a,
    h₂(a) = 0,    h₂(b) = 1,     h₂(c) = 1.

Then

    h₂(h₁⁻¹({aⁿbaⁿ | n ≥ 1}) ∩ a*bc*) = {0ⁿ1ⁿ | n ≥ 1}.    (3.2)

That is, h₁⁻¹({aⁿbaⁿ | n ≥ 1}) consists of all strings in (a + c)*b(a + c)* such that the number of symbols preceding the b is one greater than the number of symbols following the b. Thus

    h₁⁻¹({aⁿbaⁿ | n ≥ 1}) ∩ a*bc* = {aⁿbc^(n-1) | n ≥ 1}.

Line (3.2) then follows immediately by applying the homomorphism h₂. If {aⁿbaⁿ | n ≥ 1} were regular, then since homomorphisms, inverse homomorphisms, and intersection with a regular set all preserve the property of being regular, it would follow that {0ⁿ1ⁿ | n ≥ 1} is regular, a contradiction.
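Identities like line (3.2) can be sanity-checked by brute force on short strings. The sketch below enumerates all strings over {a, b, c} up to an arbitrary length bound, applies h₁-inverse and the intersection with a*bc*, then applies h₂.

```python
# A brute-force check of line (3.2) on short strings: the composed
# operations applied to {a^n b a^n} yield exactly {0^n 1^n}.

from itertools import product

h1 = {"a": "a", "b": "ba", "c": "a"}
h2 = {"a": "0", "b": "1", "c": "1"}

def apply_h(h, x):
    return "".join(h[c] for c in x)

def in_abc_star(x):                      # membership in a*bc*
    if "b" not in x:
        return False
    i = x.index("b")
    return set(x[:i]) <= {"a"} and set(x[i + 1:]) <= {"c"}

L = {"a" * n + "b" + "a" * n for n in range(1, 4)}   # a^n b a^n, n <= 3

result = set()
for length in range(1, 9):               # the bound 8 is arbitrary
    for x in map("".join, product("abc", repeat=length)):
        if apply_h(h1, x) in L and in_abc_star(x):   # h1-inverse, then cap a*bc*
            result.add(apply_h(h2, x))

print(sorted(result, key=len))  # ['01', '0011', '000111']
```

The only strings surviving both filters are aⁿbc^(n-1), and h₂ maps each of them to 0ⁿ1ⁿ, matching the calculation in the example.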
Quotients of languages

Now let us turn to the last closure property of regular sets to be proved in this section. A number of additional closure properties are given in the exercises. Define the quotient of languages L₁ and L₂, written L₁/L₂, to be

    {x | there exists y in L₂ such that xy is in L₁}.

Example 3.6 Let L₁ be 0*10* and let L₂ be 10*1. Then L₁/L₂ is empty. Since every y in L₂ has two 1's and every string xy which is in L₁ can have only one 1, there is no x such that xy is in L₁ and y is in L₂.

Let L₃ be 0*1. Then L₁/L₃ is 0*, since for any x in 0* we may choose y = 1. Clearly xy is in L₁ = 0*10* and y is in L₃ = 0*1. Since words in L₁ and L₃ each have one 1, it is not possible that words not in 0* are in L₁/L₃.

As another example, L₂/L₃ = 10*, since for each x in 10* we may again choose y = 1 from L₃, and xy will be in L₂ = 10*1. If xy is in L₂ and y is in L₃, then evidently x is in 10*.
Theorem 3.6 The class of regular sets is closed under quotient with arbitrary sets.†

† In this theorem the closure is not effective.

Proof Let M = (Q, Σ, δ, q0, F) be a finite automaton accepting some regular set R, and let L be an arbitrary language. The quotient R/L is accepted by a finite automaton M' = (Q, Σ, δ, q0, F'), which behaves like M except that the final states of M' are all states q of M such that there exists y in L for which δ(q, y) is in F. Then δ(q0, x) is in F' if and only if there exists y such that δ(q0, xy) is in F. Thus M' accepts R/L.
One should observe that the construction in Theorem 3.6 is different from all other constructions in this chapter in that it is not effective. Since L is an arbitrary set, there may be no algorithm to determine whether there exists y in L such that δ(q, y) is in F. Even if we restrict L to some finitely representable class, we still may not have an effective construction unless there is an algorithm to test for the existence of such a y. In effect we are saying that for any L, there is surely some F' such that M with F' as the set of final states accepts R/L. However, we may not be able to tell which subset of Q should be chosen as F'. In the next section we shall see that if L is a regular set, we can determine F', so the regular sets are effectively closed under quotient with a regular set.
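When L is given by a DFA, F' can indeed be computed: state q belongs to F' exactly when some common input string drives the pair (q, start state of L's automaton) into F × (L's final states), a reachability question on the product automaton. A minimal sketch in Python (the dict-based DFA encoding and state names are ours):

```python
from itertools import product

def quotient_final_states(M, N):
    """New final states F' for R/L (Theorem 3.6) when L = L(N) is regular.

    M and N are DFAs over the same alphabet, encoded as dicts with keys
    'Q', 'sigma', 'delta', 'q0', 'F'; delta maps (state, symbol) -> state.
    q is in F' iff some y drives q into M's F and N's start into N's F.
    """
    # Backward closure: product pairs from which F x N.F is reachable.
    good = set(product(M['F'], N['F']))
    changed = True
    while changed:
        changed = False
        for p in M['Q']:
            for r in N['Q']:
                if (p, r) not in good and any(
                        (M['delta'][p, a], N['delta'][r, a]) in good
                        for a in M['sigma']):
                    good.add((p, r))
                    changed = True
    return {q for q in M['Q'] if (q, N['q0']) in good}

# Example 3.6: R = 0*10* and L = 0*1 give R/L = 0*, so F' should pick out
# exactly the strings with no 1 yet, i.e. state 'A' below.
M_R = {'Q': {'A', 'B', 'C'}, 'sigma': '01',
       'delta': {('A', '0'): 'A', ('A', '1'): 'B',
                 ('B', '0'): 'B', ('B', '1'): 'C',
                 ('C', '0'): 'C', ('C', '1'): 'C'},
       'q0': 'A', 'F': {'B'}}
M_L = {'Q': {'X', 'Y', 'Z'}, 'sigma': '01',
       'delta': {('X', '0'): 'X', ('X', '1'): 'Y',
                 ('Y', '0'): 'Z', ('Y', '1'): 'Z',
                 ('Z', '0'): 'Z', ('Z', '1'): 'Z'},
       'q0': 'X', 'F': {'Y'}}
Fprime = quotient_final_states(M_R, M_L)
```

With F' = {A}, the automaton M with final-state set F' accepts 0*, matching the quotient computed in Example 3.6.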
3.3 DECISION ALGORITHMS FOR REGULAR SETS

It is important to have algorithms to answer various questions concerning regular sets. The types of questions we are concerned with include: Is a given language empty, finite, or infinite? Is one regular set equivalent to another? And so on.

Before we can establish the existence of algorithms for answering such questions, we must decide on a representation. For our purposes we shall assume regular sets are represented by finite automata. We could just as well have assumed that regular sets were represented by regular expressions or some other notation, since there exist mechanical translations from these notations into finite automata. However, one can imagine representations for which no such translation algorithm exists, and for such representations there may be no algorithm to determine whether or not a particular language is empty. The reader at this stage may feel that it is obvious that we can determine whether a regular set is empty. We shall see in Chapter 8, however, that for many interesting classes of languages the question cannot be answered.

Emptiness, finiteness, and infiniteness

Algorithms to determine whether a regular set is empty, finite, or infinite may be based on the following theorem. We shall discuss efficient algorithms after presenting the theorem.
Theorem 3.7 The set of sentences accepted by a finite automaton M with n states is:

1) nonempty if and only if the finite automaton accepts a sentence of length less than n;
2) infinite if and only if the automaton accepts some sentence of length ℓ, where n <= ℓ < 2n.

Thus there is an algorithm to determine whether a finite automaton accepts zero, a finite number, or an infinite number of sentences.

Proof

1) The "if" portion is obvious. Suppose M accepts a nonempty set. Let w be a word as short as any other word accepted. By the pumping lemma, |w| < n, for if w were the shortest and |w| >= n, then w = uvy, and uy is a shorter word in the language.

2) If w is in L(M) and n <= |w| < 2n, then by the pumping lemma, L(M) is infinite. That is, w = w1 w2 w3, and for all i, w1 w2^i w3 is in L(M). Conversely, if L(M) is infinite, then there exists w in L(M), where |w| >= n. If |w| < 2n, we are done. If no word is of length between n and 2n - 1, let w be of length at least 2n, but as short as any word in L(M) whose length is 2n or more. Again by the pumping lemma, we can write w = w1 w2 w3 with 1 <= |w2| <= n and w1 w3 in L(M). Either w was not a shortest word of length 2n or more, or w1 w3 is a word of length between n and 2n - 1, a contradiction in either case.

In part (1), the algorithm to decide whether L(M) is empty is: "See if any word of length up to n is in L(M)." Clearly there is such a procedure that is guaranteed to halt. In part (2), the algorithm to decide whether L(M) is infinite is: "See if any word of length between n and 2n - 1 is in L(M)." Again, clearly there is such a procedure that is guaranteed to halt.

It should be appreciated that the algorithms suggested in Theorem 3.7 are highly inefficient. However, one can easily test whether a DFA accepts the empty set by taking its transition diagram and deleting all states that are not reachable on any input from the start state. If one or more final states remain, the language is nonempty. Then, without changing the language accepted, we may delete all states that are not final and from which one cannot reach a final state. The DFA accepts an infinite language if and only if the resulting transition diagram has a cycle. The same method works for NFA's, but we must check that there is a cycle labeled by something besides ε.
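The efficient test just described — keep only states that are reachable from the start state and can also reach a final state, then look for a cycle — can be sketched as follows (the tuple encoding of the DFA is ours, not from the text):

```python
def classify(Q, sigma, delta, q0, F):
    """Return 'empty', 'finite', or 'infinite' for L(M), using the
    reachability-and-cycle test: restrict to useful states (reachable
    from q0 and co-reachable from F), then detect a cycle."""
    def closure(starts, edges):
        seen, stack = set(starts), list(starts)
        while stack:
            p = stack.pop()
            for r in edges.get(p, ()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    fwd = {p: {delta[p, a] for a in sigma} for p in Q}
    bwd = {}
    for p in Q:
        for r in fwd[p]:
            bwd.setdefault(r, set()).add(p)

    useful = closure({q0}, fwd) & closure(F, bwd)
    if not useful:
        return 'empty'

    # Depth-first search for a cycle within the useful states.
    color = dict.fromkeys(useful, 0)      # 0 new, 1 on stack, 2 done
    def cyclic(p):
        color[p] = 1
        for r in fwd[p]:
            if r in useful and (color[r] == 1 or
                                (color[r] == 0 and cyclic(r))):
                return True
        color[p] = 2
        return False

    return ('infinite' if any(color[p] == 0 and cyclic(p) for p in useful)
            else 'finite')

# The three-state DFA for 0*10* used above accepts infinitely many strings.
delta = {('A', '0'): 'A', ('A', '1'): 'B',
         ('B', '0'): 'B', ('B', '1'): 'C',
         ('C', '0'): 'C', ('C', '1'): 'C'}
verdict = classify({'A', 'B', 'C'}, '01', delta, 'A', {'B'})
```

This runs in time proportional to the size of the transition diagram, in contrast to the word-enumeration procedures of Theorem 3.7.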
Equivalence

Next we show that there is an algorithm to determine if two finite automata accept the same set.

Theorem 3.8 There is an algorithm to determine if two finite automata are equivalent (i.e., if they accept the same language).

Proof Let M1 and M2 be FA accepting L1 and L2, respectively. By Theorems 3.1, 3.2, and 3.3, (L1 ∩ ~L2) ∪ (~L1 ∩ L2) is accepted by some finite automaton, M3. It is easy to see that M3 accepts a word if and only if L1 ≠ L2. Hence, by Theorem 3.7, there is an algorithm to determine if L1 = L2.
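In practice one need not build M3 explicitly: exploring the reachable pairs of the product automaton and checking that the two components agree on acceptance performs the same test. A sketch (the DFA encoding and the example automata are ours):

```python
def equivalent(M1, M2):
    """True iff the two DFAs accept the same language. Each DFA is a tuple
    (sigma, delta, q0, F) with delta mapping (state, symbol) -> state;
    both must share the alphabet sigma."""
    sigma, d1, s1, F1 = M1
    _,     d2, s2, F2 = M2
    # A reachable product pair on which the automata disagree is exactly
    # a witness that (L1 ∩ ~L2) ∪ (~L1 ∩ L2) is nonempty.
    seen, stack = {(s1, s2)}, [(s1, s2)]
    while stack:
        p, q = stack.pop()
        if (p in F1) != (q in F2):
            return False
        for a in sigma:
            nxt = (d1[p, a], d2[q, a])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True

# Two automata for "an even number of 0's", one with a redundant state,
# and one automaton for the complementary language (our examples).
even1 = ('01', {('e', '0'): 'o', ('e', '1'): 'e',
                ('o', '0'): 'e', ('o', '1'): 'o'}, 'e', {'e'})
even2 = ('01', {(0, '0'): 1, (0, '1'): 0,
                (1, '0'): 2, (1, '1'): 1,
                (2, '0'): 1, (2, '1'): 2}, 0, {0, 2})
odd   = ('01', {('e', '0'): 'o', ('e', '1'): 'e',
                ('o', '0'): 'e', ('o', '1'): 'o'}, 'e', {'o'})
```

Since only reachable pairs are explored, the test touches at most |Q1| · |Q2| pairs, matching the cost of constructing M3.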
3.4 THE MYHILL-NERODE THEOREM AND MINIMIZATION OF FINITE AUTOMATA

Recall from Section 1.5 our discussion of equivalence relations and equivalence classes. We may always associate with an arbitrary language L a natural equivalence relation R_L; namely, x R_L y if and only if for each z, either both or neither of xz and yz is in L. In the worst case, each string is in an equivalence class by itself, but there may be fewer classes. In particular, the index (number of equivalence classes) is finite if L is a regular set.

There is also a natural equivalence relation on strings associated with a finite automaton. Let M = (Q, Σ, δ, q0, F) be a DFA. For x and y in Σ*, let x R_M y if and only if δ(q0, x) = δ(q0, y). The relation R_M is reflexive, symmetric, and transitive, since "=" has these properties, and thus R_M is an equivalence relation. R_M divides the set Σ* into equivalence classes, one for each state that is reachable from q0. In addition, if x R_M y, then xz R_M yz for all z in Σ*, since by Exercise 2.4,

    δ(q0, xz) = δ(δ(q0, x), z) = δ(δ(q0, y), z) = δ(q0, yz).

An equivalence relation R such that xRy implies xzRyz is said to be right invariant (with respect to concatenation). We see that every finite automaton induces a right invariant equivalence relation, defined as R_M was defined, on its set of input strings. This result is formalized in the following theorem.

Theorem 3.9 (The Myhill-Nerode theorem). The following three statements are equivalent:

1) The set L ⊆ Σ* is accepted by some finite automaton.
2) L is the union of some of the equivalence classes of a right invariant equivalence relation of finite index.
3) Let equivalence relation R_L be defined by: x R_L y if and only if for all z in Σ*, xz is in L exactly when yz is in L. Then R_L is of finite index.
Proof

(1) → (2) Assume that L is accepted by some DFA M = (Q, Σ, δ, q0, F). Let R_M be the equivalence relation x R_M y if and only if δ(q0, x) = δ(q0, y). R_M is right invariant since, for any z, if δ(q0, x) = δ(q0, y), then δ(q0, xz) = δ(q0, yz). The index of R_M is finite, since the index is at most the number of states in Q. Furthermore, L is the union of those equivalence classes that include a string x such that δ(q0, x) is in F, that is, the equivalence classes corresponding to final states.

(2) → (3) We show that any equivalence relation E satisfying (2) is a refinement of R_L; that is, every equivalence class of E is entirely contained in some equivalence class of R_L. Thus the index of R_L cannot be greater than the index of E and so is finite. Assume that xEy. Then since E is right invariant, for each z in Σ*, xzEyz, and thus yz is in L if and only if xz is in L. Thus x R_L y, and hence the equivalence class of x in E is contained in the equivalence class of x in R_L. We conclude that each equivalence class of E is contained within some equivalence class of R_L.

(3) → (1) We must first show that R_L is right invariant. Suppose x R_L y, and let w be in Σ*. We must prove that xw R_L yw; that is, for any z, xwz is in L exactly when ywz is in L. But since x R_L y, we know by definition of R_L that for any v, xv is in L exactly when yv is in L. Let v = wz to prove that R_L is right invariant.

Now let Q' be the finite set of equivalence classes of R_L and [x] the element of Q' containing x. Define δ'([x], a) = [xa]. The definition is consistent, since R_L is right invariant. Had we chosen y instead of x from the equivalence class [x], we would have obtained δ'([x], a) = [ya]. But x R_L y, so xz is in L exactly when yz is in L. In particular, if z = az', xaz' is in L exactly when yaz' is in L, so xa R_L ya, and [xa] = [ya]. Let q0' = [ε] and F' = {[x] | x is in L}. The finite automaton M' = (Q', Σ, δ', q0', F') accepts L, since δ'(q0', x) = [x], and thus x is in L(M') if and only if [x] is in F'.

Example 3.7 Let L be the language 0*10*. L is accepted by the DFA M of Fig. 3.2. Consider the relation R_M defined by M. As all states are reachable from the start state, R_M has six equivalence classes, which are

    C_a = (00)*,      C_d = (00)*01,
    C_b = (00)*0,     C_e = 0*100*,
    C_c = (00)*1,     C_f = 0*10*1(0 + 1)*.

L is the union of three of these classes, C_c, C_d, and C_e.

Fig. 3.2 DFA M accepting L.

The relation R_L for L has x R_L y if and only if either

i) x and y each have no 1's,
ii) x and y each have one 1, or
iii) x and y each have more than one 1.

For example, if x = 010 and y = 1000, then xz is in L if and only if z is in 0*. But yz is in L under exactly the same conditions. As another example, if x = 01 and y = 00, then we might choose z = 0 to show that x R_L y is false. That is, xz = 010 is in L, but yz = 000 is not.

We may denote the three equivalence classes of R_L by C1 = 0*, C2 = 0*10*, and C3 = 0*10*1(0 + 1)*. L is the language consisting of only one of these classes, C2. The relationship of the classes C_a through C_f to C1, C2, and C3 is illustrated in Fig. 3.3; R_M is a refinement of R_L. For example, C_a ∪ C_b = (00)* + (00)*0 = 0* = C1.

Fig. 3.3 Diagram showing R_M is a refinement of R_L.

From R_L we may construct a DFA as follows. Pick representatives for C1, C2, and C3, say ε, 1, and 11. Then let M' be the DFA shown in Fig. 3.4. For example, δ'([1], 0) = [1], since if w is any string in [1] (note [1] is C2, say 0^i 1 0^j), then w0 is 0^i 1 0^{j+1}, which is also in C2 = 0*10*.
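The three-state automaton M' of Fig. 3.4 can be written out concretely. The sketch below (state names ours, after the classes of R_L) simply tracks which of C1, C2, C3 the input read so far belongs to; C2 is the lone final state:

```python
# M' for L = 0*10*: states named after the classes C1 = 0*, C2 = 0*10*,
# and C3 = 0*10*1(0 + 1)* of R_L.
delta = {('C1', '0'): 'C1', ('C1', '1'): 'C2',
         ('C2', '0'): 'C2', ('C2', '1'): 'C3',
         ('C3', '0'): 'C3', ('C3', '1'): 'C3'}

def accepts(w, q='C1', finals=('C2',)):
    for a in w:
        q = delta[q, a]
    return q in finals

# Strings with exactly one 1 are accepted; all others are not.
results = [accepts(w) for w in ('0100', '1', '', '0110')]
```

Note that each state is reached by its representative: ε leads to C1, 1 to C2, and 11 to C3, exactly as in the construction of Theorem 3.9.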
Fig. 3.4 The DFA M' constructed from R_L.

Minimizing finite automata

The Myhill-Nerode theorem has, among other consequences, the implication that there is an essentially unique minimum state DFA for every regular set.

Theorem 3.10 The minimum state automaton accepting a regular set L is unique up to an isomorphism (i.e., a renaming of the states) and is given by M' in the proof of Theorem 3.9.

Proof In the proof of Theorem 3.9 we saw that any DFA M = (Q, Σ, δ, q0, F) accepting L defines an equivalence relation that is a refinement of R_L. Thus the number of states of M is greater than or equal to the number of states of M' of Theorem 3.9. If equality holds, then each of the states of M can be identified with one of the states of M'. That is, let q be a state of M. There must be some x in Σ* such that δ(q0, x) = q; otherwise q could be removed from Q, and a smaller automaton found. Identify q with the state δ'(q0', x) of M'. This identification will be consistent. If δ(q0, x) = δ(q0, y) = q, then, by the proof of Theorem 3.9, x and y are in the same equivalence class of R_L. Thus δ'(q0', x) = δ'(q0', y).
A minimization algorithm

There is a simple method for finding the minimum state DFA of Theorems 3.9 and 3.10 equivalent to a given DFA M = (Q, Σ, δ, q0, F). Let ≡ be the equivalence relation on the states of M such that p ≡ q if and only if for each input string x, δ(p, x) is an accepting state if and only if δ(q, x) is an accepting state. Observe that there is an isomorphism between those equivalence classes of ≡ that contain a state reachable from q0 by some input string and the states of the minimum state FA M'. Thus the states of M' may be identified with these classes.

Rather than give a formal algorithm for computing the equivalence classes of ≡, we first work through an example. First some terminology is needed. If p ≡ q, we say p is equivalent to q. We say that p is distinguishable from q if there exists an x such that δ(p, x) is in F and δ(q, x) is not, or vice versa.

Example 3.8 Let M be the finite automaton of Fig. 3.5. In Fig. 3.6 we have constructed a table with an entry for each pair of states. An X is placed in the table each time we discover a pair of states that cannot be equivalent. Initially the table has an X in each entry corresponding to one final state and one nonfinal state. In our example, we place an X in the entries (a, c), (b, c), (c, d), (c, e), (c, f), (c, g), and (c, h).

Fig. 3.5 Finite automaton.

Fig. 3.6 Calculation of equivalent states.

Next, for each pair of states p and q that are not already known to be distinguishable, we consider the pairs of states r = δ(p, a) and s = δ(q, a) for each input symbol a. If states r and s have been shown to be distinguishable by some string x, then p and q are distinguishable by string ax. Thus if the entry (r, s) in the table has an X, an X is also placed at the entry (p, q). If the entry (r, s) does not yet have an X, then the pair (p, q) is placed on a list associated with the (r, s)-entry. At some future time, if the (r, s) entry receives an X, then each pair on the list associated with the (r, s)-entry also receives an X.

Continuing with the example, we place an X in the entry (a, b), since the entry (δ(b, 1), δ(a, 1)) = (c, f) already has an X. Similarly, the (a, d)-entry receives an X since the entry (δ(a, 0), δ(d, 0)) = (b, c) has an X. Consideration of the (a, e)-entry on input 0 results in the pair (a, e) being placed on the list associated with (b, h). Observe that on input 1, both a and e go to the same state f, and hence no string starting with a 1 can distinguish a from e. Because of the 0-input, the pair (a, g) is placed on the list associated with (b, g). When the (b, g)-entry is considered, it receives an X on account of a 1-input, and hence the pair (a, g) receives an X since it was on the list for (b, g). The string 01 distinguishes a from g.

On completion of the table in Fig. 3.6, we conclude that the equivalent states are a ≡ e, b ≡ h, and d ≡ f. The minimum-state finite automaton is given in Fig. 3.7.
The formal algorithm for marking pairs of inequivalent states is shown in Fig. 3.8. Lemma 3.2 proves that the method outlined does indeed mark all pairs of inequivalent states.

Lemma 3.2 Let M = (Q, Σ, δ, q0, F) be a DFA. Then p is distinguishable from q if and only if the entry corresponding to the pair (p, q) is marked in the above procedure.

Fig. 3.7 Minimum state finite automaton.

begin
1)   for p in F and q in Q − F do mark (p, q);
2)   for each pair of distinct states (p, q) in F × F or (Q − F) × (Q − F) do
3)     if for some input symbol a, (δ(p, a), δ(q, a)) is marked then
       begin
4)       mark (p, q);
5)       recursively mark all unmarked pairs on the list for (p, q) and on the
         lists of other pairs that are marked at this step
       end
       else /* no pair (δ(p, a), δ(q, a)) is marked */
6)       for all input symbols a do
7)         put (p, q) on the list for (δ(p, a), δ(q, a)) unless δ(p, a) = δ(q, a)
end
Fig. 3.8 Algorithm for marking pairs of inequivalent states.

Proof Assume p is distinguishable from q, and let x be a shortest string distinguishing p from q. We prove by induction on the length of x that the entry corresponding to the pair (p, q) is marked. If x = ε, then exactly one of p and q is a final state, and hence the entry is marked in line (1). Assume that the hypothesis is true for |x| < i, i >= 1, and let |x| = i. Write x = ay, and let t = δ(p, a) and u = δ(q, a). Now y distinguishes t from u, and |y| = i − 1. Thus by the induction hypothesis the entry corresponding to the pair (t, u) eventually is marked. If this event occurs after the pair (p, q) has been considered, then either the (p, q) entry has already been marked when (t, u) is considered, or the pair (p, q) is on the list associated with (t, u), in which case it is marked at line (5). If (t, u) is marked before (p, q) is considered, then (p, q) is marked at the time it is considered. In any event the entry (p, q) is marked. A similar induction on the number of pairs marked shows that if (p, q) is marked, then p and q are distinguishable.
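The marking idea can be sketched in a few lines of Python. This version (ours) dispenses with the per-entry lists of Fig. 3.8 and simply re-scans all pairs until no new mark is added — the "obvious" marking algorithm rather than the list-based refinement. The transition table is our reconstruction, consistent with the moves quoted in Example 3.8 (the figure itself did not survive extraction):

```python
from itertools import combinations

def mark_inequivalent(Q, sigma, delta, F):
    """Table-filling: return the set of distinguishable (marked) pairs.
    Re-scans all pairs to a fixed point instead of keeping per-entry
    lists; the resulting marking is the same as in Fig. 3.8."""
    marked = {frozenset(pq) for pq in combinations(Q, 2)
              if (pq[0] in F) != (pq[1] in F)}
    changed = True
    while changed:
        changed = False
        for p, q in combinations(Q, 2):
            if frozenset((p, q)) in marked:
                continue
            for a in sigma:
                succ = frozenset((delta[p, a], delta[q, a]))
                if len(succ) == 2 and succ in marked:
                    marked.add(frozenset((p, q)))
                    changed = True
                    break
    return marked

# A transition table consistent with the moves quoted in Example 3.8
# (our reconstruction); the only final state is c.
delta = {('a', '0'): 'b', ('a', '1'): 'f',
         ('b', '0'): 'g', ('b', '1'): 'c',
         ('c', '0'): 'a', ('c', '1'): 'c',
         ('d', '0'): 'c', ('d', '1'): 'g',
         ('e', '0'): 'h', ('e', '1'): 'f',
         ('f', '0'): 'c', ('f', '1'): 'g',
         ('g', '0'): 'g', ('g', '1'): 'e',
         ('h', '0'): 'g', ('h', '1'): 'c'}
marked = mark_inequivalent('abcdefgh', '01', delta, {'c'})
unmarked = [pq for pq in combinations('abcdefgh', 2)
            if frozenset(pq) not in marked]
```

The unmarked pairs are exactly (a, e), (b, h), and (d, f), matching the conclusion of Example 3.8.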
The algorithm of Fig. 3.8 is more efficient than the obvious marking algorithm, although it is not the most efficient possible. Let Σ have k symbols and Q have n states. The loop of lines 2 through 7 is executed O(n^2) times, at most once for each pair of states. The total time spent on lines 2 through 4, 6, and 7 is O(kn^2). The time spent on line 5 is the sum of the lengths of all lists. But each pair (r, s) is put on at most k lists, at line 7. Thus the time spent on line 5 is O(kn^2) steps, so the total time is also O(kn^2).†

Theorem 3.11 The DFA constructed by the algorithm of Fig. 3.8, with inaccessible states removed, is the minimum state DFA for its language.

Proof Let M = (Q, Σ, δ, q0, F) be the DFA to which the algorithm is applied, and let M' = (Q', Σ, δ', [q0], F') be the DFA constructed. That is,

    Q' = {[q] | q is accessible from q0},
    F' = {[q] | q is in F},

and δ'([q], a) = [δ(q, a)]. It is easy to show that δ' is consistently defined, since if q ≡ p, then δ(q, a) ≡ δ(p, a). That is, if δ(q, a) is distinguished from δ(p, a) by x, then ax distinguishes q from p. It is also easy to show that δ'([q0], w) = [δ(q0, w)] by induction on |w|. Thus L(M') = L(M).

Now we must show that M' has no more states than R_L has equivalence classes, where L = L(M). Suppose it did; then there are two accessible states q and p in Q such that [q] ≠ [p], yet there are x and y such that δ(q0, x) = q, δ(q0, y) = p, and x R_L y. We claim that p ≡ q, for if not, then some w in Σ* distinguishes p from q. But then xw R_L yw is false, for we may let z = ε and observe that exactly one of xwz and ywz is in L. But since R_L is right invariant, xw R_L yw is true. Hence q and p do not exist, and M' has no more states than the index of R_L. Thus M' is the minimum state DFA for L.
EXERCISES

3.1 Which of the following languages are regular sets? Prove your answer.
a) {0^{2n} | n >= 1}
b) {0^m 1^n 0^{m+n} | m >= 1 and n >= 1}
c) {0^n | n is a prime}
d) the set of all strings that do not have three consecutive 0's
e) the set of all strings with an equal number of 0's and 1's
f) {x | x in (0 + 1)*, and x = x^R} (x^R is x written backward; for example, (011)^R = 110)
g) {xwx^R | x, w in (0 + 1)^+}
*h) {xx^R w | x, w in (0 + 1)^+}

† We say that g(n) is O(f(n)) if there exist constants c and n0 such that g(n) <= cf(n) for all n >= n0.

3.2 Prove the following extension of the pumping lemma for regular sets. Let L be a regular set. Then there exists a constant n such that for each z1, z2, z3, with z1 z2 z3 in L and |z2| = n, z2 can be written z2 = uvw such that |v| >= 1 and for each i >= 0, z1 u v^i w z3 is in L.

3.3 Use Exercise 3.2 to prove that {0^i 1^m 2^m | i >= 1, m >= 1} is nonregular.

3.4 Let L be a regular set. Which of the following sets are regular? Justify your answers.
a) {a1 a3 a5 ... a_{2n-1} | a1 a2 a3 a4 ... a_{2n} is in L}
b) {a2 a1 a4 a3 ... a_{2n} a_{2n-1} | a1 a2 ... a_{2n} is in L}
c) CYCLE(L) = {x1 x2 | x2 x1 is in L for strings x1 and x2}
d) MAX(L) = {x in L | for no y other than ε is xy in L}
e) MIN(L) = {x in L | no proper prefix of x is in L}
f) INIT(L) = {x | for some y, xy is in L}
g) L^R = {x | x^R is in L}
h) {x | xx^R is in L}

3.5 Let value(x) be the result when the symbols of x are multiplied from left to right according to the table of Fig. 2.31.
a) Is L = {xy | |x| = |y| and value(x) = value(y)} regular?
b) Is L = {xy | value(x) = value(y)} regular?
Justify your answers.

3.6 Show that {0^i 1^j | gcd(i, j) = 1} is not regular.

3.7 Let L be any subset of 0*. Prove that L* is regular.

3.8 A set of integers is linear if it is of the form {c + pi | i = 0, 1, 2, ...}. A set is semilinear if it is the finite union of linear sets. Let R ⊆ 0* be regular. Prove that {i | 0^i is in R} is semilinear.

3.9 Is the class of regular sets closed under infinite union?

3.10 What is the relationship between the class of regular sets and the least class of languages closed under union, intersection, and complement containing all finite sets?

3.11 Give a finite automaton construction to prove that the class of regular sets is closed under substitution.

3.12 Is the class of regular sets closed under inverse substitution?

3.13 Let h be the homomorphism h(a) = 01, h(b) = 0.
a) Find h^{-1}(L1), where L1 = (10 + 1)*.
b) Find h(L2), where L2 = (a + b)*.
c) Find h^{-1}(L3), where L3 is the set of all strings of 0's and 1's with an equal number of 0's and 1's.

3.14 Show that 2DFA with endmarkers (see Exercise 2.20) accept only regular sets by making use of closure properties developed in this chapter.

3.15 The use of ∩ with regular expressions does not allow representation of new sets. However, it does allow more compact expression. Show that ∩ can shorten a regular expression by an exponential amount. [Hint: What is the regular expression of shortest length describing the set consisting of the one sentence (...((a1^2 a2)^2 a3)^2 ... a_n)^2?]
** 3.16 Let L be a language. Define 1/2(L) to be

    {x | for some y such that |x| = |y|, xy is in L}.

That is, 1/2(L) is the set of first halves of strings in L. Prove for each regular L that 1/2(L) is regular.

** 3.17 If L is regular, is the set of first thirds of strings in L regular? What about the last third? Middle third? Is

    {xz | for some y with |x| = |y| = |z|, xyz is in L}

regular?

** 3.18 Show that if L is regular, so are
a) SQRT(L) = {x | for some y with |y| = |x|^2, xy is in L},
b) LOG(L) = {x | for some y with |y| = 2^{|x|}, xy is in L}.

* 3.19 A one-pebble 2DFA is a 2DFA with the added capability of marking a tape square by placing a pebble on it. The next state function depends on the present state, the tape symbol scanned, and the presence or absence of a pebble on the tape square scanned. A move consists of a change of state, a direction of head motion, and possibly placing or removing the pebble from the scanned tape cell. The automaton "jams" if it attempts to place a second pebble on the input. Prove that one-pebble 2DFA's accept only regular sets. [Hint: Add two additional tracks to the input that contain tables indicating, for each state p, the state q in which the 2DFA will return if it moves left or right from the tape cell in state p, under the assumption that the pebble is not encountered. Observe that the one-pebble 2DFA operating on the augmented tape need never leave its pebble. Then make use of a homomorphic mapping to remove the additional tracks.]

* 3.20 In converting an NFA to a DFA the number of states may increase substantially. Give upper and lower bounds on the maximum increase in number of states for an n-state NFA. [Hint: Consider Exercises 2.5(e) and 2.8(c).]

3.21 Give a decision procedure to determine if the set accepted by a DFA is
a) the set of all strings over a given alphabet,
b) cofinite (a set whose complement is finite).

** 3.22 Consider a DFA M. Suppose you are told that M has at most n states and you wish to determine the transition diagram of M. Suppose further that the only way you can obtain information concerning M is by supplying an input sequence x and observing the prefixes of x which are accepted.
a) What assumptions must you make concerning the transition diagram of M in order to be able to determine the transition diagram?
b) Give an algorithm for determining the transition diagram of M (except for the start state), including the construction of x, under your assumptions in part (a).

**S 3.23 Give an efficient decision procedure to determine if x is in the language denoted by an extended regular expression (a regular expression with operators ∪, · (concatenation), *, ∩, and complement).

3.24 Give an efficient decision procedure for determining if a semi-extended regular expression r (a regular expression with ∪, ·, *, ∩) denotes a nonempty set. [Hint: Space O(|r|) and time O(2^{|r|}) are sufficient.]

3.25 Find the minimum-state finite automaton equivalent to the transition diagram of Fig. 3.9.

Fig. 3.9 A finite automaton.

3.26
a) What are the equivalence classes of R_L for L = {0^n 1^n | n >= 1}?
b) Use your answer in (a) to show {0^n 1^n | n >= 1} is not regular.
c) Repeat (a) for L = {x | x has an equal number of 0's and 1's}.

3.27 R is a congruence relation if xRy implies wxzRwyz for all w and z. Prove the analog of the Myhill-Nerode theorem (Theorem 3.9) for congruence relations: a set is regular if and only if it is the union of some of the congruence classes of a congruence relation of finite index.

3.28 Let M be a finite automaton with n states. Let p and q be distinguishable states of M, and let x be a shortest string distinguishing p and q. How long can the string x be as a function of n?

3.29 In a two-tape FA each state is designated as reading tape 1 or tape 2. A pair of strings (x, y) is accepted if the FA, when presented with strings x and y on its respective tapes, reaches a final state with the tape heads immediately to the right of x and y. Let L be the set of pairs accepted by a two-tape FA M. Give algorithms to answer the following questions.
a) Is L empty?
b) Is L finite?
c) Do there exist L1 and L2 such that L = L1 × L2?

3.30
a) Prove that there exists a constant c > 0 such that the algorithm of Fig. 3.8 requires time greater than cn^2 for infinitely many DFA, where n is the number of states and the input alphabet has two symbols.
*b) Give an algorithm for minimizing states in a DFA whose execution time is O(|Σ| n log n). Here Σ is the input alphabet. [Hint: Instead of asking for each pair of states (p, q) and each input a if δ(p, a) and δ(q, a) are distinguishable, partition the states into final and nonfinal states. Then refine the partition by considering all states whose next state under some input symbol is in one particular block of the partition. Each time a block is partitioned, refine the partition further by using the smaller subblock. Use list processing to make the algorithm as efficient as possible.]
Solutions to Selected Exercises

3.4(b) L' = {a2 a1 a4 a3 ... a_{2n} a_{2n-1} | a1 a2 ... a_{2n} is in L} is regular. Let M = (Q, Σ, δ, q0, F) be a DFA accepting L. We construct a DFA M' that accepts L'. M' will process its tape symbols in pairs. On seeing the first symbol a in a pair, M' stores a in its finite control. Then on seeing the second symbol b, M' behaves like M on the input ba. More formally,

    M' = (Q ∪ Q × Σ, Σ, δ', q0, F),

where

i) δ'(q, a) = [q, a], and
ii) δ'([q, a], b) = δ(q, ba).

To prove that M' accepts L' we show by induction on even i that

    δ'(q0, a2 a1 a4 a3 ... a_i a_{i-1}) = δ(q0, a1 a2 ... a_i).

Clearly, for i = 0, δ'(q0, ε) = q0 = δ(q0, ε). Assume that the hypothesis is true for all even j < i. By the induction hypothesis,

    δ'(q0, a2 a1 ... a_{i-2} a_{i-3}) = δ(q0, a1 a2 ... a_{i-2}) = p

for some p. Thus

    δ'(q0, a2 a1 ... a_i a_{i-1}) = δ'(p, a_i a_{i-1}) = δ(p, a_{i-1} a_i) = δ(q0, a1 a2 ... a_i).

Therefore a2 a1 a4 a3 ... a_i a_{i-1} is in L(M') if and only if a1 a2 ... a_i is in L(M), and thus L(M') = L'.
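The construction can be simulated without building M' explicitly: read the input two symbols at a time and feed each pair to M in swapped order. A sketch (our toy M accepts (01)*; encoding and state names are ours):

```python
def accepts_swapped(delta, q0, F, w):
    """Simulate the M' of the solution to 3.4(b): on each input pair
    (s1, s2), behave like M on the string s2 s1. Words of odd length
    are rejected, as in the construction."""
    if len(w) % 2:
        return False
    q = q0
    for i in range(0, len(w), 2):
        s1, s2 = w[i], w[i + 1]
        q = delta[delta[q, s2], s1]   # delta'([q, s1], s2) = delta(q, s2 s1)
    return q in F

# Toy M accepting (01)*: S start/final, A after an unmatched 0, D dead.
delta = {('S', '0'): 'A', ('S', '1'): 'D',
         ('A', '0'): 'D', ('A', '1'): 'S',
         ('D', '0'): 'D', ('D', '1'): 'D'}
# L' then contains the pairwise-swapped strings, e.g. '10' (from '01').
checks = [accepts_swapped(delta, 'S', {'S'}, w)
          for w in ('10', '1010', '01', '')]
```

Storing the first symbol of each pair in a local variable plays the role of the states Q × Σ in the formal construction.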
3.23 One can clearly construct a finite automaton equivalent to R by combining finite automata corresponding to subexpressions of R and then simulating the automaton on x. We must examine the combining process to see how it affects the size of the resulting automaton. If we work with DFA's, then the number of states for a union or intersection grows as the product. However, concatenation and closure may increase the number of states exponentially, as we need to convert DFA's to NFA's and then perform the subset construction. If we work with NFA's, then the number of states is additive for union, concatenation, and closure, and increases as the product for intersection. However, complements require a conversion from an NFA to a DFA and hence an exponential increase in the number of states. Since operators can be nested, the number of states can be exponentiated on the order of n times for an expression with n operators, and thus this technique is not in general feasible.

A more efficient method based on a dynamic programming technique (see Aho, Hopcroft, and Ullman [1974]) yields an algorithm whose execution time is polynomial in the length of the input w and the length of the regular expression s. Let n = |w| + |s|. Construct a table which for each subexpression r of s and each substring x_{ij} of w gives the answer to the question: Is x_{ij} in L(r)? Here x_{ij} is the substring of w of length j beginning at position i. The table is of size at most n^3, since there are at most n subexpressions of s and n(n + 1)/2 substrings of w. Fill in the table starting with entries for small subexpressions (those without operators, that is, a, ε, or ∅). Then fill in entries for r, where r is of one of the forms r1 ∪ r2, r1 · r2, r1*, r1 ∩ r2, or ¬r1. We handle only the case r1*, and proceed in order of the length of x. To determine if x is in r1*, given that we already know for each proper substring y of x whether y is in r1 or in r1*, we need only check for each x1 and x2 such that x = x1 x2 and x1 ≠ ε, whether x1 is in r1 and x2 is in r1*. Thus to calculate the table entry for x and r requires time O(|w|). Hence the time to fill in the entire table is O(n^4). To determine if w is in s, we need only consult the entry for s and w, noting that w = x_{1k}, where k = |w|.
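The table-filling scheme above can be sketched as a memoized recursion over (subexpression, substring) pairs — one cached entry per table cell. The tuple encoding of expressions below is ours, not from the text:

```python
from functools import lru_cache

def matches(r, w):
    """Membership for an extended regular expression: one table entry per
    (subexpression, substring) pair, as in the dynamic-programming method.
    Expressions are nested tuples: ('sym', a), ('eps',),
    ('or' | 'cat' | 'and', r1, r2), ('star', r1), ('not', r1)."""

    @lru_cache(maxsize=None)
    def entry(r, i, j):                  # is w[i:j] in L(r)?
        op = r[0]
        if op == 'sym':
            return j == i + 1 and w[i] == r[1]
        if op == 'eps':
            return i == j
        if op == 'or':
            return entry(r[1], i, j) or entry(r[2], i, j)
        if op == 'and':
            return entry(r[1], i, j) and entry(r[2], i, j)
        if op == 'not':
            return not entry(r[1], i, j)
        if op == 'cat':
            return any(entry(r[1], i, k) and entry(r[2], k, j)
                       for k in range(i, j + 1))
        if op == 'star':                 # split off a nonempty first piece
            return i == j or any(entry(r[1], i, k) and entry(r, k, j)
                                 for k in range(i + 1, j + 1))
        raise ValueError(op)

    return entry(r, 0, len(w))

ab_star = ('star', ('cat', ('sym', 'a'), ('sym', 'b')))   # (ab)*
```

For example, matches(ab_star, 'abab') holds, while the complement ('not', ab_star) matches exactly the remaining strings; intersection and complement cost no more than a table lookup per cell, in contrast to the automaton-based approach.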
BIBLIOGRAPHIC NOTES

The pumping lemma for regular sets is based on the formulation of Bar-Hillel, Perles, and Shamir [1961]. Theorem 3.4, closure under substitution, is also from there. Theorem 3.5, closure under inverse homomorphism, is from Ginsburg and Rose [1963b], and Theorem 3.6, on quotients, is from Ginsburg and Spanier [1963]. Theorems 3.7 and 3.8 on decision algorithms are from Moore [1956]. Ginsburg and Rose [1966] give a number of additional closure properties of regular sets.
Theorem 3.9, which we call the Myhill-Nerode theorem, is actually due to Nerode [1958]. The similar result of Exercise 3.27 on congruence relations is due to Myhill [1957]. The algorithm for minimizing finite automata is due to Huffman [1954] and Moore [1956]. Hopcroft [1971] gives a more efficient algorithm. Example 3.2, the unrecognizability of the primes in binary, was proved by Minsky and Papert [1966] by another method.

Proportional removal operations, such as Exercise 3.16, were first studied in generality by Stearns and Hartmanis [1963]. Generalizations such as Exercise 3.18 were considered by Kosaraju [1974] and Seiferas [1974], and the question of what functions of the string length may be removed from the front to yield regular sets was solved completely by Seiferas and McNaughton [1976]. A solution to Exercise 3.22 was first considered by Hennie [1964]. An algorithm for determining equivalence for deterministic two-tape FA is found in Bird [1973].
CHAPTER 4

CONTEXT-FREE GRAMMARS

4.1 MOTIVATION AND INTRODUCTION
In this chapter we introduce context-free grammars and the languages they describe, the context-free languages. The context-free languages, like the regular sets, are of great practical importance, notably in defining programming languages, in formalizing the notion of parsing, in simplifying translation of programming languages, and in other string-processing applications. As an example, context-free grammars are useful for describing arithmetic expressions with arbitrary nesting of balanced parentheses, and block structure in programming languages (that is, begin's and end's matched like parentheses). Neither of these aspects of programming languages can be represented by regular expressions.

A context-free grammar is a finite set of variables (also called nonterminals or syntactic categories), each of which represents a language. The languages represented by the variables are described recursively in terms of each other and primitive symbols called terminals. The rules relating the variables are called productions. A typical production states that the language associated with a given variable contains strings that are formed by concatenating strings from the languages of certain other variables, possibly along with some terminals.
The original motivation for context-free grammars was the description of natural languages. We may write rules such as

⟨sentence⟩ → ⟨noun phrase⟩⟨verb phrase⟩
⟨noun phrase⟩ → ⟨adjective⟩⟨noun phrase⟩
⟨noun phrase⟩ → ⟨noun⟩
⟨noun⟩ → boy
⟨adjective⟩ → little                                             (4.1)
where the syntactic categories† are denoted by angle brackets and terminals by unbracketed words like "boy" and "little." The meaning of

⟨sentence⟩ → ⟨noun phrase⟩⟨verb phrase⟩

is that one way to form a sentence (a string in the language of the syntactic category ⟨sentence⟩) is to take a noun phrase and follow it by a verb phrase. The meaning of

⟨noun⟩ → boy

is that the string consisting of the one terminal symbol "boy" is in the language of the syntactic category ⟨noun⟩. Note that "boy" is a single terminal symbol, not a string of three symbols.
For a number of reasons, context-free grammars are not in general regarded as adequate for the description of natural languages like English. For example, if we extended the productions of (4.1) to encompass all of English, we would be able to derive "rock" as a noun phrase and "runs" as a verb phrase. Thus "rock runs" would be a sentence, which is nonsense. Clearly some semantic information is necessary to rule out meaningless strings that are syntactically correct. More subtle problems arise when attempts are made to associate the meaning of the sentence with its derivation. Nevertheless, context-free grammars play an important role in computer linguistics.

While linguists were studying context-free grammars, computer scientists began to describe programming languages by a notation called Backus-Naur Form (BNF), which is the context-free grammar notation with minor changes in format and some shorthand. This use of context-free grammars has greatly simplified the definition of programming languages and the construction of compilers. The reason for this success is undoubtedly due in part to the natural way in which most programming language constructs are described by grammars. For example, consider the set of productions
1) ⟨expression⟩ → ⟨expression⟩ + ⟨expression⟩
2) ⟨expression⟩ → ⟨expression⟩ * ⟨expression⟩
3) ⟨expression⟩ → (⟨expression⟩)
4) ⟨expression⟩ → id                                             (4.2)
which defines the arithmetic expressions with operators + and * and operands represented by the symbol id. Here ⟨expression⟩ is the only variable, and the terminals are +, *, (, ), and id. The first two productions say that an expression can be composed of two expressions connected by an addition or multiplication sign. The third production says that an expression may be another expression surrounded by parentheses. The last says a single operand is an expression. By applying productions repeatedly we can obtain more and more complicated expressions. For example,

⟨expression⟩ ⇒ ⟨expression⟩ * ⟨expression⟩
             ⇒ (⟨expression⟩) * ⟨expression⟩
             ⇒ (⟨expression⟩) * id
             ⇒ (⟨expression⟩ + ⟨expression⟩) * id
             ⇒ (⟨expression⟩ + id) * id
             ⇒ (id + id) * id                                    (4.3)

† Recall that the term "syntactic category" is a synonym for "variable." It is preferred when dealing with natural languages.
The symbol ⇒ denotes the act of deriving, that is, replacing a variable by the right-hand side of a production for that variable. The first line of (4.3) is obtained from the second production. The second line is obtained by replacing the first ⟨expression⟩ in line 1 by the right-hand side of the third production. The remaining lines are the results of applying productions (4), (1), (4), and (4). The last line, (id + id) * id, consists solely of terminal symbols and thus is a word in the language of ⟨expression⟩.
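To make the replacement step concrete, the sketch below replays derivation (4.3) one production at a time; the helper function and the list representation of sentential forms are our own, not the book's.

```python
def apply_production(form, pos, head, body):
    """One => step: replace the variable `head` at index `pos` of a
    sentential form (a list of grammar symbols) by the body's symbols."""
    assert form[pos] == head
    return form[:pos] + body + form[pos + 1:]

E = '<expression>'
form = [E]
# derivation (4.3): productions (2), (3), (4), (1), (4), (4)
steps = [(0, [E, '*', E]),
         (0, ['(', E, ')']),
         (4, ['id']),
         (1, [E, '+', E]),
         (3, ['id']),
         (1, ['id'])]
for pos, body in steps:
    form = apply_production(form, pos, E, body)

print(''.join(form))  # the last line of (4.3)
```

Running the six steps yields (id+id)*id, matching the last line of (4.3).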
4.2 CONTEXT-FREE GRAMMARS

Now we shall formalize the intuitive notions introduced in the previous section. A context-free grammar (CFG or just grammar) is denoted G = (V, T, P, S), where V and T are finite sets of variables and terminals, respectively. We assume that V and T are disjoint. P is a finite set of productions; each production is of the form A → α, where A is a variable and α is a string of symbols from (V ∪ T)*. Finally, S is a special variable called the start symbol.
Example 4.1  Suppose we use E instead of ⟨expression⟩ for the variable in the grammar (4.2). Then we could formally express this grammar as ({E}, {+, *, (, ), id}, P, E), where P consists of

E → E + E
E → E * E
E → (E)
E → id
In this and the next two chapters we use the following conventions regarding grammars.

1) The capital letters A, B, C, D, E, and S denote variables; S is the start symbol unless otherwise stated.
2) The lower-case letters a, b, c, d, e, digits, and boldface strings are terminals.
3) The capital letters X, Y, and Z denote symbols that may be either terminals or variables.
4) The lower-case letters u, v, w, x, y, and z denote strings of terminals.
5) The lower-case Greek letters α, β, and γ denote strings of variables and terminals.

By adhering to the above conventions, we can deduce the variables, terminals, and the start symbol of a grammar solely by examining the productions. Thus we often present a grammar by simply listing its productions. If A → α₁, A → α₂, …, A → αₖ are the productions for the variable A of some grammar, then we may express them by the notation

A → α₁ | α₂ | ⋯ | αₖ

where the vertical line is read "or." The entire grammar of Example 4.1 could be written

E → E + E | E * E | (E) | id
Derivations and languages

We now formally define the language generated by a grammar G = (V, T, P, S). To do so, we develop notation to represent a derivation. First we define two relations ⇒_G and ⇒*_G between strings in (V ∪ T)*. If A → β is a production of P and α and γ are any strings in (V ∪ T)*, then αAγ ⇒_G αβγ. We say that the production A → β is applied to the string αAγ to obtain αβγ, or that αAγ directly derives αβγ in grammar G. Two strings are related by ⇒_G exactly when the second is obtained from the first by one application of some production.

Suppose that α₁, α₂, …, αₘ are strings in (V ∪ T)*, m ≥ 1, and

α₁ ⇒_G α₂,  α₂ ⇒_G α₃,  …,  αₘ₋₁ ⇒_G αₘ.

Then we say α₁ ⇒*_G αₘ, or α₁ derives αₘ in grammar G. That is, ⇒*_G is the reflexive and transitive closure of ⇒_G (see Section 1.5 for a discussion of closures of relations). Alternatively, α ⇒*_G β if β follows from α by application of zero or more productions of P. Note that α ⇒*_G α for each string α. Usually, if it is clear which grammar G is involved, we use ⇒ for ⇒_G and ⇒* for ⇒*_G. If α derives β by exactly i steps, we say α ⇒ⁱ β.

The language generated by G [denoted L(G)] is {w | w is in T* and S ⇒* w}. That is, a string is in L(G) if:

1) The string consists solely of terminals.
2) The string can be derived from S.

We call L a context-free language (CFL) if it is L(G) for some CFG G. A string of terminals and variables α is called a sentential form if S ⇒* α. We define grammars G₁ and G₂ to be equivalent if L(G₁) = L(G₂).
Example 4.2  Consider a grammar G = (V, T, P, S), where V = {S}, T = {a, b}, and P = {S → aSb, S → ab}. Here, S is the only variable; a and b are terminals. There are two productions, S → aSb and S → ab. By applying the first production n − 1 times, followed by an application of the second production, we have

S ⇒ aSb ⇒ aaSbb ⇒ a³Sb³ ⇒ ⋯ ⇒ aⁿ⁻¹Sbⁿ⁻¹ ⇒ aⁿbⁿ.

Furthermore, the only strings in L(G) are aⁿbⁿ for n ≥ 1. Each time S → aSb is used, the number of S's remains the same. After using the production S → ab, we find that the number of S's in the sentential form decreases by one. Thus, after using S → ab, no S's remain in the resulting string. Since both productions have an S on the left, the only order in which the productions can be applied is S → aSb some number of times followed by one application of S → ab. Thus, L(G) = {aⁿbⁿ | n ≥ 1}.
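The claim L(G) = {aⁿbⁿ | n ≥ 1} can be spot-checked mechanically up to a length bound. The sketch below (our own encoding, with upper-case letters as variables) does a breadth-first search over sentential forms, always expanding the leftmost variable; since neither production shrinks a form, forms longer than the bound may be pruned.

```python
from collections import deque

def generate(productions, start, max_len):
    """All terminal strings of length <= max_len derivable from `start`,
    for a grammar with no e-productions (forms never shrink)."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None:          # no variables left: a word of L(G)
            words.add(form)
            continue
        for body in productions[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

print(generate({'S': ['aSb', 'ab']}, 'S', 6))
```

Expanding only the leftmost variable loses nothing here: every terminal word has a leftmost derivation.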
Example 4.2 was a simple example of a grammar. It was relatively easy to determine which words were derivable and which were not. In general, it may be exceedingly hard to determine what is generated by the grammar. Here is another, more difficult example.

Example 4.3  Consider G = (V, T, P, S), where V = {S, A, B}, T = {a, b}, and P consists of the following:

S → aB        S → bA
A → a         B → b
A → aS        B → bS
A → bAA       B → aBB

The language L(G) is the set of all words in T⁺ consisting of an equal number of a's and b's. We shall prove this statement by induction on the length of a word.

Inductive hypothesis  For w in T⁺,

1) S ⇒* w if and only if w consists of an equal number of a's and b's.
2) A ⇒* w if and only if w has one more a than it has b's.
3) B ⇒* w if and only if w has one more b than it has a's.
The inductive hypothesis is certainly true if |w| = 1, since A ⇒* a, B ⇒* b, and no terminal string of length one is derivable from S. Also, since all productions but A → a and B → b increase the length of a string, no strings of length one other than a and b are derivable from A and B, respectively.

Suppose that the inductive hypothesis is true for all w of length k − 1 or less. We shall show that it is true for |w| = k. First, if S ⇒* w, then the derivation must begin with either S → aB or S → bA. In the first case, w is of the form aw₁, where |w₁| = k − 1 and B ⇒* w₁. By the inductive hypothesis, the number of b's in w₁ is one more than the number of a's, so w consists of an equal number of a's and b's. A similar argument prevails if the derivation begins with S → bA.

We must now prove the "only if" of part (1), that is, if |w| = k and w consists of an equal number of a's and b's, then S ⇒* w. Either the first symbol of w is a or it is b. Assume that w = aw₁. Now |w₁| = k − 1, and w₁ has one more b than it has a's. By the inductive hypothesis, B ⇒* w₁. But then S ⇒ aB ⇒* aw₁ = w. A similar argument prevails if the first symbol of w is b.

Our task is not done. To complete the proof, we must prove parts (2) and (3) of the inductive hypothesis for w of length k. We do this in a manner similar to our method of proof for part (1); this part is left to the reader.
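The three parts of the inductive hypothesis can also be spot-checked by exhaustive search. The sketch below (our own encoding, upper-case letters as variables) enumerates everything derivable from S, A, and B up to length 6 and compares against the stated counting conditions.

```python
from collections import deque
from itertools import product

# Example 4.3: S -> aB | bA,  A -> a | aS | bAA,  B -> b | bS | aBB
P = {'S': ['aB', 'bA'],
     'A': ['a', 'aS', 'bAA'],
     'B': ['b', 'bS', 'aBB']}

def derivable(start, max_len):
    """Terminal strings of length <= max_len derivable from `start`.
    No production shrinks a sentential form, so longer forms are pruned."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None:
            words.add(form)
            continue
        for body in P[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

short = [''.join(p) for n in range(1, 7) for p in product('ab', repeat=n)]
# parts (1)-(3) of the inductive hypothesis, verified for |w| <= 6
assert derivable('S', 6) == {w for w in short if w.count('a') == w.count('b')}
assert derivable('A', 6) == {w for w in short if w.count('a') == w.count('b') + 1}
assert derivable('B', 6) == {w for w in short if w.count('b') == w.count('a') + 1}
```

A finite check is no proof, of course, but it is a useful sanity test before attempting the induction.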
4.3 DERIVATION TREES

It is useful to display derivations as trees. These pictures, called derivation (or parse) trees, superimpose on the words of a language a structure that is useful in applications such as the compilation of programming languages.

The vertices of a derivation tree are labeled with terminal or variable symbols of the grammar, or possibly with ε. If an interior vertex n is labeled A, and the sons of n are labeled X₁, X₂, …, Xₖ from the left, then A → X₁X₂⋯Xₖ must be a production. Figure 4.1 shows the parse tree for derivation (4.3). Note that if we read the leaves, in left-to-right order, we get the last line of (4.3), (id + id) * id.

Fig. 4.1  Derivation tree.
More formally, let G = (V, T, P, S) be a CFG. A tree is a derivation (or parse) tree for G if:

1) Every vertex has a label, which is a symbol of V ∪ T ∪ {ε}.
2) The label of the root is S.
3) If a vertex is interior and has label A, then A must be in V.
4) If n has label A and vertices n₁, n₂, …, nₖ are the sons of vertex n, in order from the left, with labels X₁, X₂, …, Xₖ, respectively, then

A → X₁X₂⋯Xₖ

must be a production in P.
5) If vertex n has label ε, then n is a leaf and is the only son of its father.

Example 4.4  Consider the grammar G = ({S, A}, {a, b}, P, S), where P consists of

S → aAS | a
A → SbA | SS | ba

We draw a tree, just this once, with circles instead of points for the vertices. The vertices will be numbered for reference; the labels will be adjacent to the vertices. See Fig. 4.2. The interior vertices are 1, 3, 4, 5, and 7. Vertex 1 has label S, and the labels of its sons are a, A, and S from the left. Note that S → aAS is a production. Likewise, vertex 3 has label A, and the labels of its sons are S, b, and A from the left. A → SbA is also a production. Vertices 4 and 5 each have label S, and their only sons each have label a; S → a is a production. Lastly, vertex 7 has label A, and its sons, from the left, have labels b and a. A → ba is also a production. Thus, the conditions for Fig. 4.2 to be a derivation tree for G have been met.

Fig. 4.2  Example of a derivation tree.
We may extend the "from the left" ordering of sons to produce a left-to-right ordering of all the leaves. In fact, for any two vertices, neither of which is an ancestor of the other, one is to the left of the other. Given vertices v₁ and v₂, follow the paths from these vertices toward the root until they meet at some vertex w. Let x₁ and x₂ be the sons of w on the paths from v₁ and v₂, respectively. If v₁ is not an ancestor of v₂ or vice versa, then x₁ ≠ x₂. Suppose x₁ is to the left of x₂ in the ordering of the sons of w. Then v₁ is to the left of v₂. In the opposite case, v₂ is to the left of v₁. For example, if v₁ and v₂ are vertices 9 and 11 in Fig. 4.2, then w is 3, x₁ = 5, and x₂ = 7. As 5 is to the left of 7, it follows that 9 is to the left of 11.

We shall see that a derivation tree is a natural description of the derivation of a particular sentential form of the grammar G. If we read the labels of the leaves from left to right, we have a sentential form. We call this string the yield of the derivation tree. Later, we shall see that if α is the yield of some derivation tree for grammar G = (V, T, P, S), then S ⇒* α, and conversely.

We need one additional concept, that of a subtree. A subtree of a derivation tree is a particular vertex of the tree together with all its descendants, the edges connecting them, and their labels. It looks just like a derivation tree, except that the label of the root may not be the start symbol of the grammar. If variable A labels the root, then we call the subtree an A-tree. Thus "S-tree" is a synonym for "derivation tree" if S is the start symbol.
Example 4.5  Consider the grammar and derivation tree of Example 4.4. The derivation tree of Fig. 4.2 is reproduced without numbered vertices as Fig. 4.3(a). The yield of the tree in Fig. 4.3(a) is aabbaa. Referring to Fig. 4.2 again, we see that the leaves are the vertices numbered 2, 9, 6, 10, 11, and 8, in that order, from the left. These vertices have labels a, a, b, b, a, a, respectively. Note that in this case all leaves had terminals for labels, but there is no reason why this should always be so; some leaves could be labeled by ε or by a variable. Note that S ⇒* aabbaa by the derivation

S ⇒ aAS ⇒ aSbAS ⇒ aabAS ⇒ aabbaS ⇒ aabbaa.

Figure 4.3(b) shows a subtree of the tree illustrated in part (a). It is vertex 3 of Fig. 4.2, together with its descendants. The label of the root of the subtree is A, and its yield is abba. The derivation in this case is A ⇒ SbA ⇒ abA ⇒ abba.
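The tree conditions and the yield can both be checked mechanically. In the sketch below (an encoding of our own, not the book's), a vertex is a (label, sons) pair, a leaf has sons equal to None, and the tree is the one of Fig. 4.2 / Fig. 4.3(a) for the grammar of Example 4.4.

```python
# Grammar of Example 4.4:  S -> aAS | a,   A -> SbA | SS | ba
P = {'S': ['aAS', 'a'], 'A': ['SbA', 'SS', 'ba']}

# The tree of Fig. 4.2: a vertex is (label, sons); leaves have sons None.
tree = ('S', [('a', None),
              ('A', [('S', [('a', None)]),
                     ('b', None),
                     ('A', [('b', None), ('a', None)])]),
              ('S', [('a', None)])])

def is_parse_tree(node, root=True):
    """Conditions (2)-(4): root labeled S; at each interior vertex the
    label and the sons' labels, read left to right, form a production."""
    label, sons = node
    if root and label != 'S':
        return False
    if sons is None:
        return True
    if ''.join(s[0] for s in sons) not in P.get(label, []):
        return False
    return all(is_parse_tree(s, root=False) for s in sons)

def tree_yield(node):
    """Concatenation of the leaf labels, left to right."""
    label, sons = node
    return label if sons is None else ''.join(tree_yield(s) for s in sons)

assert is_parse_tree(tree)
assert tree_yield(tree) == 'aabbaa'
```

The two assertions restate Example 4.5: the tree satisfies the derivation-tree conditions, and its yield is aabbaa.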
The relationship between derivation trees and derivations

Theorem 4.1  Let G = (V, T, P, S) be a context-free grammar. Then S ⇒* α if and only if there is a derivation tree in grammar G with yield α.

Proof  We shall find it easier to prove something in excess of the theorem. What we shall prove is that for any A in V, A ⇒* α if and only if there is an A-tree with α as the yield.
Suppose, first, that α is the yield of an A-tree. We prove, by induction on the number of interior vertices in the tree, that A ⇒* α. If there is only one interior vertex, the tree must look like the one in Fig. 4.4. In that case, X₁X₂⋯Xₙ must be α, and A → α must be a production of P, by definition of a derivation tree.

Fig. 4.4  Tree with one interior vertex.

Now, suppose that the result is true for trees with up to k − 1 interior vertices, and that α is the yield of an A-tree with k interior vertices, k > 1. Consider the sons of the root. These could not all be leaves. Let the labels of the sons be X₁, X₂, …, Xₙ, in order from the left. Then surely A → X₁X₂⋯Xₙ is a production in P. Note that n may be any integer greater than or equal to one in the argument that follows.

If the ith son is not a leaf, it is the root of a subtree, and Xᵢ must be a variable. The subtree must be an Xᵢ-tree and has some yield αᵢ. If vertex i is a leaf, let αᵢ = Xᵢ. It is easy to see that if j < i, vertex j and all of its descendants are to the left of vertex i and all of its descendants. Thus α = α₁α₂⋯αₙ. A subtree must have fewer interior vertices than its tree does, unless the subtree is the entire tree. By the inductive hypothesis, for each vertex i that is not a leaf, Xᵢ ⇒* αᵢ, since the subtree with root Xᵢ is not the entire tree. If αᵢ = Xᵢ, then surely Xᵢ ⇒* αᵢ. We can put all these partial derivations together, to see that

A ⇒ X₁X₂⋯Xₙ ⇒* α₁X₂⋯Xₙ ⇒* α₁α₂X₃⋯Xₙ ⇒* ⋯ ⇒* α₁α₂⋯αₙ = α.    (4.4)

Thus A ⇒* α. Note that (4.4) is only one of many possible derivations we could produce from the given parse tree.
Now, suppose that A ⇒* α. We must show that there is an A-tree with yield α. If A ⇒* α by a single step, then A → α is a production in P, and there is a tree with yield α, of the form shown in Fig. 4.4.

Now, assume that for any variable A, if A ⇒* α by a derivation of fewer than k steps, then there is an A-tree with yield α. Suppose that A ⇒* α by a derivation of k steps. Let the first step be A → X₁X₂⋯Xₙ. It should be clear that any symbol in α must either be one of X₁, X₂, …, Xₙ or be derived from one of these. Also, the portion of α derived from Xᵢ must lie to the left of the symbols derived from Xⱼ if i < j. Thus, we can write α as α₁α₂⋯αₙ, where for each i between 1 and n,

1) αᵢ = Xᵢ if Xᵢ is a terminal, and
2) Xᵢ ⇒* αᵢ if Xᵢ is a variable.

If Xᵢ is a variable, then the derivation of αᵢ from Xᵢ must take fewer than k steps, since the entire derivation A ⇒* α takes k steps, and the first step is surely not part of the derivation Xᵢ ⇒* αᵢ. Thus, by the inductive hypothesis, for each Xᵢ that is a variable, there is an Xᵢ-tree with yield αᵢ. Let this tree be Tᵢ.

We begin by constructing an A-tree with n leaves labeled X₁, X₂, …, Xₙ and no other vertices. This tree is shown in Fig. 4.5(a). Each vertex with label Xᵢ, where Xᵢ is not a terminal, is replaced by the tree Tᵢ. If Xᵢ is a terminal, no replacement is made. An example appears in Fig. 4.5(b). The yield of this tree is α.

Fig. 4.5  Derivation trees.
Example 4.6  Consider the derivation S ⇒* aabbaa of Example 4.5. The first step is S → aAS. If we follow the derivation, we see that A eventually is replaced by SbA, then by abA, and finally by abba. Figure 4.3(b) is a parse tree for this derivation. The only symbol derived from S in aAS is a. (This replacement is the last step.) Figure 4.6(a) is a tree for the latter derivation. Figure 4.6(b) is the derivation tree for S → aAS. If we replace the vertex with label A in Fig. 4.6(b) by the tree of Fig. 4.3(b), and the vertex with label S in Fig. 4.6(b) with the tree of Fig. 4.6(a), we get the tree of Fig. 4.3(a), whose yield is aabbaa.
Fig. 4.6  Derivation trees.
Leftmost and rightmost derivations; ambiguity

If at each step in a derivation a production is applied to the leftmost variable, then the derivation is said to be leftmost. Similarly, a derivation in which the rightmost variable is replaced at each step is said to be rightmost. If w is in L(G) for CFG G, then w has at least one parse tree, and corresponding to a particular parse tree, w has a unique leftmost and a unique rightmost derivation. In the proof of Theorem 4.1, the derivation of α from A corresponding to the parse tree in question is leftmost, provided the derivations Xᵢ ⇒* αᵢ are made leftmost. If instead of derivation (4.4) we (recursively) made the derivation Xᵢ ⇒* αᵢ be rightmost and replaced the Xᵢ's by the αᵢ's from the right rather than the left, we would obtain the rightmost derivation corresponding to the parse tree.

Of course, w may have several rightmost or leftmost derivations, since there may be more than one parse tree for w. However, it is easy to show that from each derivation tree, only one leftmost and one rightmost derivation may be obtained. Also, the construction of Theorem 4.1 produces different derivation trees from different leftmost or different rightmost derivations.
Example 4.7  The leftmost derivation corresponding to the tree of Fig. 4.3(a) is

S ⇒ aAS ⇒ aSbAS ⇒ aabAS ⇒ aabbaS ⇒ aabbaa.

The corresponding rightmost derivation is

S ⇒ aAS ⇒ aAa ⇒ aSbAa ⇒ aSbbaa ⇒ aabbaa.
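The unique leftmost derivation determined by a parse tree can be read off recursively, as in the proof of Theorem 4.1 with each Xᵢ ⇒* αᵢ taken leftmost. The sketch below (our own encoding: a vertex is a (label, sons) pair, sons None at a leaf) reproduces the leftmost derivation of Example 4.7 from the tree of Fig. 4.3(a).

```python
def leftmost_derivation(node):
    """Sentential forms of the leftmost derivation a parse tree encodes:
    the leftmost variable is always the one expanded next."""
    label, sons = node
    forms = [label]
    if sons is None:
        return forms
    forms.append(''.join(s[0] for s in sons))
    prefix = ''
    for i, son in enumerate(sons):
        suffix = ''.join(s[0] for s in sons[i + 1:])
        sub = leftmost_derivation(son)   # each son fully expanded first
        for form in sub[1:]:             # sub[0] is just the son's label
            forms.append(prefix + form + suffix)
        prefix += sub[-1]                # the son's yield
    return forms

# the tree of Fig. 4.3(a) for S -> aAS | a, A -> SbA | SS | ba
tree = ('S', [('a', None),
              ('A', [('S', [('a', None)]),
                     ('b', None),
                     ('A', [('b', None), ('a', None)])]),
              ('S', [('a', None)])])

print(leftmost_derivation(tree))
```

The printed list is exactly S, aAS, aSbAS, aabAS, aabbaS, aabbaa, the leftmost derivation of Example 4.7; taking the sons right to left instead would produce the rightmost derivation.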
A context-free grammar G such that some word has two parse trees is said to be ambiguous. From what we have said above, an equivalent definition of ambiguity is that some word has more than one leftmost derivation or more than one rightmost derivation. A CFL for which every CFG is ambiguous is said to be an inherently ambiguous CFL. We shall show in Section 4.7 that inherently ambiguous CFL's exist.
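Ambiguity can be exhibited by counting parse trees. For the grammar E → E+E | E*E | id (the grammar of Example 4.1 without the parenthesis production), the sketch below counts trees for a token string by choosing which + or * is the production applied at the root; the encoding is our own.

```python
from functools import lru_cache

# E -> E+E | E*E | id, over token tuples such as ('id', '+', 'id')
@lru_cache(maxsize=None)
def count_trees(tokens):
    """Number of parse trees: one per choice of the operator expanded
    at the root, times the counts for the two operand halves."""
    if tokens == ('id',):
        return 1
    return sum(count_trees(tokens[:i]) * count_trees(tokens[i + 1:])
               for i, t in enumerate(tokens) if t in '+*')

print(count_trees(('id', '+', 'id', '*', 'id')))  # 2
```

The string id + id * id has two parse trees (one grouping as (id + id) * id, one as id + (id * id)), so this grammar is ambiguous, even though the language it generates is not inherently so.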
4.4 SIMPLIFICATION OF CONTEXT-FREE GRAMMARS

There are several ways in which one can restrict the format of productions without reducing the generative power of context-free grammars. If L is a nonempty context-free language, then it can be generated by a context-free grammar G with the following properties.

1) Each variable and each terminal of G appears in the derivation of some word in L.
2) There are no productions of the form A → B, where A and B are variables.

Furthermore, if ε is not in L, there need be no productions of the form A → ε. In fact, if ε is not in L, we can require that every production of G be of one of the forms A → BC and A → a, where A, B, and C are variables and a is an arbitrary terminal. Alternatively, we could make every production of G be of the form A → aα, where α is a string of variables (perhaps empty). These two special forms are called Chomsky normal form and Greibach normal form, respectively.
Useless symbols

We now undertake the task of eliminating useless symbols from a grammar. Let G = (V, T, P, S) be a grammar. A symbol X is useful if there is a derivation S ⇒* αXβ ⇒* w for some α, β, and w, where w is in T* (recall our convention regarding names of symbols and strings). Otherwise X is useless. There are two aspects to usefulness. First, some terminal string must be derivable from X, and second, X must occur in some string derivable from S. These two conditions are not, however, sufficient to guarantee that X is useful, since X may occur only in sentential forms that contain a variable from which no terminal string can be derived.

Lemma 4.1  Given a CFG G = (V, T, P, S), with L(G) ≠ ∅, we can effectively find an equivalent CFG G′ = (V′, T, P′, S) such that for each A in V′ there is some w in T* for which A ⇒* w.
Proof  Each variable A with production A → w in P clearly belongs in V′. If A → X₁X₂⋯Xₙ is a production, where each Xᵢ is either a terminal or a variable already placed in V′, then a terminal string can be derived from A by a derivation beginning A ⇒ X₁X₂⋯Xₙ, and thus A belongs in V′. The set V′ can be computed by a straightforward iterative algorithm. P′ is the set of all productions whose symbols are in V′ ∪ T.

The algorithm of Fig. 4.7 finds all variables A that belong to V′. Surely if A is added to NEWV at line (2) or (5), then A derives a terminal string. To show NEWV is not too small, we must show that if A derives a terminal string w, then A is eventually added to NEWV. We do so by induction on the length of the derivation A ⇒* w.

Basis  If the length is one, then A → w is a production, and A is added to NEWV in step (2).

Induction  Let A ⇒* w by a derivation of k steps. Then we may write A ⇒ X₁X₂⋯Xₙ ⇒* w, where w = w₁w₂⋯wₙ and Xᵢ ⇒* wᵢ, for 1 ≤ i ≤ n, by a derivation of fewer than k steps.
begin
1)    OLDV := ∅;
2)    NEWV := {A | A → w for some w in T*};
3)    while OLDV ≠ NEWV do
      begin
4)        OLDV := NEWV;
5)        NEWV := OLDV ∪ {A | A → α for some α in (T ∪ OLDV)*}
      end;
6)    V′ := NEWV
end

Fig. 4.7  Calculation of V′.
By the inductive hypothesis, those Xᵢ that are variables are eventually added to NEWV. At the while-loop test of line (3), immediately after the last of the Xᵢ's is added to NEWV, we cannot have NEWV = OLDV, for the last of these Xᵢ's is not in OLDV. Thus the while-loop iterates at least once more, and A will be added to NEWV at line (5).

Take V′ to be the set computed at line (6) and P′ to be all productions whose symbols are in V′ ∪ T. Surely G′ = (V′, T, P′, S) satisfies the property that if A is in V′, then A ⇒* w for some w. Also, as every derivation in G′ is a derivation of G, we know L(G′) ⊆ L(G). But if there is some w in L(G) not in L(G′), then any derivation of w in G must involve a variable in V − V′ or a production in P − P′ (which implies there is a variable in V − V′ used). But then there is a variable in V − V′ that derives a terminal string, a contradiction.

Lemma 4.2  Given a CFG G = (V, T, P, S), we can effectively find an equivalent CFG G′ = (V′, T′, P′, S) such that for each X in V′ ∪ T′ there exist α and β in (V′ ∪ T′)* for which S ⇒* αXβ.

Proof  The set V′ ∪ T′ of symbols appearing in sentential forms of G is constructed by an iterative algorithm. Place S in V′. If A is placed in V′ and A → α₁ | α₂ | ⋯ | αₙ, then add all variables of α₁, α₂, …, αₙ to the set V′ and all terminals of α₁, α₂, …, αₙ to T′. P′ is the set of productions of P containing only symbols of V′ ∪ T′.

By first applying Lemma 4.1 and then Lemma 4.2, we can convert a grammar to an equivalent one with no useless symbols. It is interesting to note that applying Lemma 4.2 first and Lemma 4.1 second may fail to eliminate all useless symbols.
Theorem 4.2  Every nonempty CFL is generated by a CFG with no useless symbols.
Proof  Let L = L(G) be a nonempty CFL. Let G₁ be the result of applying the construction of Lemma 4.1 to G, and let G₂ be the result of applying the construction of Lemma 4.2 to G₁. Suppose G₂ has a useless symbol X. By Lemma 4.2, there is a derivation S ⇒* αXβ in G₂. Since all symbols of G₂ are symbols of G₁, it follows from Lemma 4.1 that S ⇒* αXβ ⇒* w for some terminal string w. Therefore, no symbol in the derivation S ⇒* αXβ ⇒* w is eliminated by Lemma 4.2. Thus X derives a terminal string in G₂, and hence X is not useless as supposed.

Example 4.8  Consider the grammar

S → AB | a
A → a                                            (4.5)

Applying Lemma 4.1, we find that no terminal string is derivable from B. We therefore eliminate B and the production S → AB. Applying Lemma 4.2 to the grammar

S → a
A → a                                            (4.6)

we find that only S and a appear in sentential forms. Thus ({S}, {a}, {S → a}, S) is an equivalent grammar with no useless symbols.

Suppose we first applied Lemma 4.2 to (4.5). We would find that all symbols appeared in sentential forms. Then applying Lemma 4.1 we would be left with (4.6), which has a useless symbol, A.
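Lemmas 4.1 and 4.2 are easy to implement, and running them on grammar (4.5) in both orders shows why the order matters. The representation below (production bodies as lists of symbols) is our own sketch.

```python
def generating(T, P):
    """Lemma 4.1 (the algorithm of Fig. 4.7): the variables that
    derive some terminal string."""
    oldv, newv = None, set()
    while oldv != newv:
        oldv = newv
        newv = oldv | {A for A, bodies in P.items()
                       if any(all(s in T or s in oldv for s in b)
                              for b in bodies)}
    return newv

def reachable(V, P, S):
    """Lemma 4.2: the symbols appearing in some sentential form."""
    reach, frontier = {S}, [S]
    while frontier:
        for body in P.get(frontier.pop(), []):
            for s in body:
                if s not in reach:
                    reach.add(s)
                    if s in V:
                        frontier.append(s)
    return reach

# grammar (4.5):  S -> AB | a,   A -> a
V, T = {'S', 'A', 'B'}, {'a'}
P = {'S': [['A', 'B'], ['a']], 'A': [['a']]}

gen = generating(T, P)                 # B generates no terminal string
P1 = {A: [b for b in bodies if all(s in T or s in gen for s in b)]
      for A, bodies in P.items() if A in gen}
assert gen == {'S', 'A'}
assert reachable(gen, P1, 'S') == {'S', 'a'}   # A has become useless too

# the other order first finds everything reachable, leaving A behind
assert reachable(V, P, 'S') == {'S', 'A', 'B', 'a'}
```

After Lemma 4.1 then Lemma 4.2, only S and a survive, exactly the grammar ({S}, {a}, {S → a}, S) of Example 4.8; in the opposite order the reachability pass sees all of S, A, B first, and the useless A remains.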
ε-Productions

We now turn our attention to the elimination of productions of the form A → ε, which we call ε-productions. Surely if ε is in L(G), we cannot eliminate all ε-productions from G, but if ε is not in L(G), it turns out that we can. The method is to determine for each variable A whether A ⇒* ε. If so, we call A nullable. We may replace each production B → X₁X₂⋯Xₙ by all productions formed by striking out some subset of those Xᵢ's that are nullable, but we do not include B → ε, even if all Xᵢ's are nullable.
Theorem 4.3  If L = L(G) for some CFG G = (V, T, P, S), then L − {ε} is L(G′) for a CFG G′ with no useless symbols or ε-productions.

Proof  We can determine the nullable symbols of G by the following iterative algorithm. To begin, if A → ε is a production, then A is nullable. Then, if B → α is a production and all symbols of α have been found nullable, then B is nullable. We repeat this process until no more nullable symbols can be found.

The set of productions P′ is constructed as follows. If A → X₁X₂⋯Xₙ is in P, then add all productions A → α₁α₂⋯αₙ to P′, where

1) if Xᵢ is not nullable, then αᵢ = Xᵢ;
2) if Xᵢ is nullable, then αᵢ is either Xᵢ or ε;
3) not all αᵢ's are ε.
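The nullable computation and the construction of P′ can be written out directly. The sketch below (our own list-of-symbols representation, with ε-productions as empty bodies) applies it to S → aSb | ε, for which the construction should produce S → aSb | ab.

```python
from itertools import combinations

def nullable_vars(P):
    """Variables A with A =>* e, by the iterative algorithm."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for A, bodies in P.items():
            if A not in nullable and any(all(s in nullable for s in b)
                                         for b in bodies):
                nullable.add(A)
                changed = True
    return nullable

def eliminate_eps(P):
    """Replace each production by every variant obtained by striking
    out a subset of its nullable symbols, never keeping an empty body."""
    nullable = nullable_vars(P)
    newP = {}
    for A, bodies in P.items():
        out = []
        for body in bodies:
            spots = [i for i, s in enumerate(body) if s in nullable]
            for r in range(len(spots) + 1):
                for drop in combinations(spots, r):
                    b = [s for i, s in enumerate(body) if i not in drop]
                    if b and b not in out:
                        out.append(b)
        newP[A] = out
    return newP

P = {'S': [['a', 'S', 'b'], []]}          # S -> aSb | e
assert nullable_vars(P) == {'S'}
assert eliminate_eps(P) == {'S': [['a', 'S', 'b'], ['a', 'b']]}
```

The result generates {aⁿbⁿ | n ≥ 1}, which is L(G) − {ε}, as Theorem 4.3 promises.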
Let G″ = (V, T, P′, S). We claim that for all A in V and w in T*, A ⇒* w in G″ if and only if w ≠ ε and A ⇒* w in G.

If  Suppose A ⇒* w in G and w ≠ ε. We prove by induction on the number of steps i in the derivation that A ⇒* w in G″. The basis, i = 1, is trivial, for A → w must be a production in P. Since w ≠ ε, it is also a production of P′. For the inductive step, let i > 1. Then A ⇒ X₁X₂⋯Xₙ ⇒* w in G, the latter part in fewer than i steps. Write w = w₁w₂⋯wₙ such that for each j, Xⱼ ⇒* wⱼ in fewer than i steps. If wⱼ ≠ ε and Xⱼ is a variable, then by the inductive hypothesis we have Xⱼ ⇒* wⱼ in G″. If wⱼ = ε, then Xⱼ is nullable. Thus A → β₁β₂⋯βₙ is a production in P′, where βⱼ = Xⱼ if wⱼ ≠ ε and βⱼ = ε if wⱼ = ε. Since w ≠ ε, β₁β₂⋯βₙ is not ε. Hence we have a derivation

A ⇒ β₁β₂⋯βₙ ⇒* w₁β₂⋯βₙ ⇒* w₁w₂β₃⋯βₙ ⇒* ⋯ ⇒* w₁w₂⋯wₙ = w

in G″.

Only if  Suppose A ⇒* w in G″. Surely w ≠ ε, since G″ has no ε-productions. We show by induction on the number of steps i that A ⇒* w in G. For the basis, i = 1, observe that A → w is a production in P′. There must be a production A → α in P such that by striking out certain nullable symbols from α we are left with w. Then there is a derivation A ⇒ α ⇒* w, where the derivation α ⇒* w involves deriving ε from the nullable symbols of α that were struck out in order to get w.

For the induction step, let i > 1. Then A ⇒ X₁X₂⋯Xₙ ⇒* w in G″. There must be some A → β in P such that X₁X₂⋯Xₙ is found by striking out some nullable symbols from β. Thus A ⇒ β ⇒* X₁X₂⋯Xₙ in G. Write w = w₁w₂⋯wₙ such that for all j, Xⱼ ⇒* wⱼ by fewer than i steps. By the inductive hypothesis, Xⱼ ⇒* wⱼ in G if Xⱼ is a variable. Certainly if Xⱼ is a terminal, then wⱼ = Xⱼ, and Xⱼ ⇒* wⱼ is trivially true. Thus A ⇒* w in G.

The last step of the proof is to apply Theorem 4.2 to G″ to produce G′ with no useless symbols. Since the constructions of Lemmas 4.1 and 4.2 do not introduce any productions, G′ has neither ε-productions nor useless symbols. Furthermore, S ⇒* w in G′ if and only if w ≠ ε and S ⇒* w in G. That is, L(G′) = L(G) − {ε}.
From here on we assume that no grammar has useless symbols. We now turn our attention to productions of the form A → B whose right-hand side consists of a single variable. We call these unit productions. All other productions, including those of the form A → a and ε-productions, are nonunit productions.

Theorem 4.4  Every CFL without ε is defined by a grammar with no useless symbols, ε-productions, or unit productions.

Proof  Let L be a CFL without ε and L = L(G) for some G = (V, T, P, S). By Theorem 4.3, assume G has no ε-productions. Construct a new set of productions P′ from P by first including all nonunit productions of P. Then, suppose that A ⇒* B for A and B in V. Add to P′ all productions of the form A → α, where B → α is a nonunit production of P.
Observe that we can easily test whether A ⇒* B, since G has no ε-productions, and if

A ⇒ B₁ ⇒ B₂ ⇒ ··· ⇒ Bₘ ⇒ B

and some variable appears twice in the sequence, we can find a shorter sequence of unit productions that results in A ⇒* B. Thus it is sufficient to consider only those sequences of unit productions that do not repeat any of the variables of G.
We now have a modified grammar, G′ = (V, T, P′, S). Surely, if A → α is a production of P′, then A ⇒* α in G. Thus, if there is a derivation of w in G′, then there is a derivation of w in G.

Suppose that w is in L(G), and consider a leftmost derivation of w in G, say

S = α₀ ⇒ α₁ ⇒ α₂ ⇒ ··· ⇒ αₙ = w.

If, for 0 ≤ i < n, αᵢ ⇒ αᵢ₊₁ by a nonunit production, then αᵢ ⇒ αᵢ₊₁ in G′. Suppose that αᵢ ⇒ αᵢ₊₁ by a unit production, but that αᵢ₋₁ ⇒ αᵢ by a nonunit production, or i = 0. Also suppose that αᵢ₊₁ ⇒ αᵢ₊₂ ⇒ ··· ⇒ αⱼ, all by unit productions, and αⱼ ⇒ αⱼ₊₁ by a nonunit production. Then αᵢ, αᵢ₊₁, ..., αⱼ are all of the same length, and since the derivation is leftmost, the symbol replaced in each of these must be at the same position. But then αᵢ ⇒ αⱼ₊₁ by one of the productions of P′ − P. Hence L(G′) = L(G).

To complete the proof, we observe that G′ has no unit productions or ε-productions. If we use Lemmas 4.1 and 4.2 to eliminate useless symbols, we do not add any productions, so the result of applying the constructions of these lemmas to G′ is a grammar satisfying the theorem.
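The construction just proved is a reachability computation over single-variable right sides, followed by copying each reachable variable's nonunit bodies. A sketch in the same illustrative tuple encoding (the expression grammar used to exercise it is a standard example, not one from this section):

```python
def eliminate_units(productions):
    """Theorem 4.4 sketch: remove unit productions A -> B.
    productions: dict variable -> list of bodies (tuples of symbols)."""
    variables = set(productions)
    # unit_pairs[A] = every B with A =>* B using unit productions only.
    unit_pairs = {A: {A} for A in variables}
    changed = True
    while changed:
        changed = False
        for A in variables:
            for B in list(unit_pairs[A]):
                for body in productions[B]:
                    if len(body) == 1 and body[0] in variables and body[0] not in unit_pairs[A]:
                        unit_pairs[A].add(body[0])
                        changed = True
    # P' contains, for each A, the nonunit productions of every B with A =>* B.
    new_prods = {A: set() for A in variables}
    for A in variables:
        for B in unit_pairs[A]:
            for body in productions[B]:
                if not (len(body) == 1 and body[0] in variables):
                    new_prods[A].add(body)
    return new_prods

P = {"E": [("E", "+", "T"), ("T",)],
     "T": [("T", "*", "F"), ("F",)],
     "F": [("(", "E", ")"), ("id",)]}
P1 = eliminate_units(P)
# E now directly derives E+T, T*F, (E), and id.
```

Because no variable need repeat in a chain A ⇒ B₁ ⇒ ··· ⇒ B, the fixed-point loop terminates after at most |V| rounds.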
4.5 CHOMSKY NORMAL FORM
We now prove the first of two normal-form theorems. These each state that all context-free grammars are equivalent to grammars with restrictions on the forms of productions.

Theorem 4.5 (Chomsky normal form, or CNF) Any context-free language without ε is generated by a grammar in which all productions are of the form A → BC or A → a. Here A, B, and C are variables and a is a terminal.
Proof Let G be a context-free grammar generating a language not containing ε. By Theorem 4.4, we can find an equivalent grammar, G₁ = (V, T, P, S), such that P contains no unit productions or ε-productions. Thus, if a production has a single symbol on the right, that symbol is a terminal, and the production is already in an acceptable form.

Now consider a production in P of the form A → X₁X₂···Xₘ, where m ≥ 2. If Xᵢ is a terminal, a, introduce a new variable Cₐ and a production Cₐ → a, which is in Chomsky normal form already; if Cₐ → a is not yet present, add it, and replace Xᵢ by Cₐ. Let the new set of variables be V′ and the new set of productions be P′. Consider the grammar G₂ = (V′, T, P′, S).
If α ⇒ β in G₁, then α ⇒* β in G₂. Thus L(G₁) ⊆ L(G₂). Now we show by induction on the number of steps in a derivation that if A ⇒* w in G₂, for A in V and w in T*, then A ⇒* w in G₁. The result is trivial for one-step derivations. Suppose that it is true for derivations of up to k steps. Let A ⇒* w be a (k + 1)-step derivation. The first step must be of the form A → B₁B₂···Bₘ, m ≥ 2. We can write w = w₁w₂···wₘ, where Bᵢ ⇒* wᵢ in G₂, for 1 ≤ i ≤ m. By the construction of P′, there must be a production A → X₁X₂···Xₘ of P, where Xᵢ = Bᵢ if Bᵢ is in V, and Xᵢ = aᵢ if Bᵢ is Cₐᵢ for some terminal aᵢ. In the latter case, wᵢ = aᵢ. For those Bᵢ in V, we know that the derivation Bᵢ ⇒* wᵢ takes no more than k steps, so by the inductive hypothesis, Xᵢ ⇒* wᵢ in G₁. Hence A ⇒* w in G₁.

We have now proved the intermediate result that any context-free language can be generated by a grammar for which every production is either of the form A → a or A → B₁B₂···Bₘ, for m ≥ 2. Here A and B₁, B₂, ..., Bₘ are variables, and a is a terminal.

Consider such a grammar G₂ = (V′, T, P′, S). We modify G₂ by adding some additional symbols to V′ and replacing some productions of P′. For each production A → B₁B₂···Bₘ of P′, where m ≥ 3, we create new variables D₁, D₂, ..., Dₘ₋₂ and replace A → B₁B₂···Bₘ by the set of productions

{A → B₁D₁, D₁ → B₂D₂, ..., Dₘ₋₃ → Bₘ₋₂Dₘ₋₂, Dₘ₋₂ → Bₘ₋₁Bₘ}.

Let V″ be the new nonterminal vocabulary and P″ the new set of productions. Let G₃ = (V″, T, P″, S). G₃ is in CNF. It is clear that if A ⇒ β in G₂, then A ⇒* β in G₃, so L(G₂) ⊆ L(G₃). But it is also true that L(G₃) ⊆ L(G₂), as can be shown in essentially the same manner as it was shown that L(G₂) ⊆ L(G₁). The proof will be left to the reader.
Example 4.9 Let us consider the grammar ({S, A, B}, {a, b}, P, S) that has the productions:

S → bA | aB
A → bAA | aS | a
B → aBB | bS | b

and find an equivalent grammar in CNF.

First, the only productions already in proper form are A → a and B → b. There are no unit productions, so we may begin by replacing terminals on the right by variables, except in the case of the productions A → a and B → b. S → bA is replaced by S → CᵦA and Cᵦ → b. Similarly, A → aS is replaced by A → CₐS and Cₐ → a; A → bAA is replaced by A → CᵦAA; S → aB is replaced by S → CₐB; B → bS is replaced by B → CᵦS, and B → aBB is replaced by B → CₐBB. In the next stage, the production A → CᵦAA is replaced by A → CᵦD₁ and D₁ → AA, and the production B → CₐBB is replaced by B → CₐD₂ and D₂ → BB.
The productions for the grammar in CNF are shown below.

S → CᵦA | CₐB
A → CₐS | CᵦD₁ | a
B → CᵦS | CₐD₂ | b
D₁ → AA
D₂ → BB
Cₐ → a
Cᵦ → b
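Both stages of the proof, replacing terminals in long bodies with C-variables and then binarizing with D-variables, can be sketched as follows. Applied to the grammar of Example 4.9 it reproduces the production set above, up to the spelling of the new variable names; the encoding is our own illustration:

```python
def to_cnf(productions, terminals):
    """Theorem 4.5 sketch: convert a grammar with no unit or epsilon-
    productions (dict variable -> list of tuple bodies) into CNF."""
    new_prods = {v: [] for v in productions}

    def terminal_var(a):
        # Stage 1 helper: a fresh variable C_a with the single production C_a -> a.
        name = "C_" + a
        if name not in new_prods:
            new_prods[name] = [(a,)]
        return name

    counter = [0]
    for var, bodies in productions.items():
        for body in bodies:
            if len(body) == 1:                  # A -> a is already acceptable
                new_prods[var].append(body)
                continue
            body = tuple(terminal_var(s) if s in terminals else s for s in body)
            left = var
            while len(body) > 2:                # Stage 2: binarize with D-variables
                counter[0] += 1
                d = "D" + str(counter[0])
                new_prods[d] = []
                new_prods[left].append((body[0], d))
                left, body = d, body[1:]
            new_prods[left].append(body)
    return new_prods

P = {"S": [("b", "A"), ("a", "B")],
     "A": [("b", "A", "A"), ("a", "S"), ("a",)],
     "B": [("a", "B", "B"), ("b", "S"), ("b",)]}
cnf = to_cnf(P, {"a", "b"})
# every body is now a single terminal or a pair of variables
```

The helper names C_a, C_b, D1, D2 play the roles of Cₐ, Cᵦ, D₁, D₂ in the example.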
4.6 GREIBACH NORMAL FORM
We now develop a normal-form theorem that uses productions whose right-hand sides each start with a terminal symbol, perhaps followed by some variables. First we prove two lemmas that say we can modify the productions of a CFG in certain ways without affecting the language generated. Define an A-production to be a production with variable A on the left.

Lemma 4.3 Let G = (V, T, P, S) be a CFG. Let A → α₁Bα₂ be a production in P and B → β₁ | β₂ | ··· | βᵣ be the set of all B-productions. Let G₁ = (V, T, P₁, S) be obtained from G by deleting the production A → α₁Bα₂ from P and adding the productions A → α₁β₁α₂ | α₁β₂α₂ | ··· | α₁βᵣα₂. Then L(G) = L(G₁).
Proof Obviously L(G₁) ⊆ L(G), since if A → α₁βᵢα₂ is used in a derivation of G₁, then A ⇒ α₁Bα₂ ⇒ α₁βᵢα₂ can be used in G. To show that L(G) ⊆ L(G₁), one simply notes that A → α₁Bα₂ is the only production in G not in G₁. Whenever A → α₁Bα₂ is used in a derivation by G, the variable B must be rewritten at some later step using a production of the form B → βᵢ. These two steps can be replaced by the single step A ⇒ α₁βᵢα₂.
Lemma 4.4 Let G = (V, T, P, S) be a CFG. Let A → Aα₁ | Aα₂ | ··· | Aαᵣ be the set of A-productions for which A is the leftmost symbol of the right-hand side. Let A → β₁ | β₂ | ··· | βₛ be the remaining A-productions. Let G₁ = (V ∪ {B}, T, P₁, S) be the CFG formed by adding the variable B to V and replacing all the A-productions by the productions:

1) A → βᵢ and A → βᵢB,  1 ≤ i ≤ s,
2) B → αᵢ and B → αᵢB,  1 ≤ i ≤ r.

Then L(G₁) = L(G).

Proof In a leftmost derivation, a sequence of productions of the form A → Aαᵢ must eventually end with a production A → βⱼ. The sequence of replacements

A ⇒ Aαᵢ₁ ⇒ Aαᵢ₂αᵢ₁ ⇒ ··· ⇒ Aαᵢₚ···αᵢ₁ ⇒ βⱼαᵢₚ···αᵢ₁

in G can be replaced in G₁ by

A ⇒ βⱼB ⇒ βⱼαᵢₚB ⇒ βⱼαᵢₚαᵢₚ₋₁B ⇒ ··· ⇒ βⱼαᵢₚ···αᵢ₂B ⇒ βⱼαᵢₚ···αᵢ₁.

The reverse transformation can also be made. Thus L(G) = L(G₁). Figure 4.8 shows this transformation on derivation trees, where we see that a chain of A's extending to the left in G is replaced in G₁ by a chain of B's extending to the right.
Fig. 4.8 Transformation of Lemma 4.4 on a portion of a derivation tree.
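Lemma 4.4's replacement of left recursion by right recursion is mechanical enough to sketch directly. The tuple encoding and the primed name for the new variable B are our own conventions:

```python
def remove_left_recursion(productions, A):
    """Lemma 4.4 sketch: replace A -> A a1 | ... | A ar | b1 | ... | bs
    by A -> bj | bj B and B -> ai | ai B, for a new variable B."""
    alphas = [body[1:] for body in productions[A] if body and body[0] == A]
    betas = [body for body in productions[A] if not body or body[0] != A]
    if not alphas:
        return productions                 # no left recursion to remove
    B = A + "'"                            # fresh name, assumed unused
    new = dict(productions)
    new[A] = betas + [beta + (B,) for beta in betas]
    new[B] = alphas + [alpha + (B,) for alpha in alphas]
    return new

# E -> E+T | T   becomes   E -> T | T E'   and   E' -> +T | +T E'
P1 = remove_left_recursion({"E": [("E", "+", "T"), ("T",)]}, "E")
```

As in Fig. 4.8, the chain of E's growing to the left becomes a chain of E′'s growing to the right.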
Theorem 4.6 (Greibach normal form, or GNF) Every context-free language L without ε can be generated by a grammar for which every production is of the form A → aα, where A is a variable, a is a terminal, and α is a (possibly empty) string of variables.
Proof Let G = (V, T, P, S) be a Chomsky normal form grammar generating the CFL L. Assume that V = {A₁, A₂, ..., Aₘ}. The first step in the construction is to modify the productions so that if Aᵢ → Aⱼγ is a production, then j > i. Starting with A₁ and proceeding to Aₘ, we do this as follows. We assume that the productions have been modified so that for 1 ≤ i < k, Aᵢ → Aⱼγ is a production only if j > i. We now modify the Aₖ-productions by substituting for Aⱼ the right-hand side of each Aⱼ-production according to Lemma 4.3. By repeating the process k − 1 times at most, we obtain productions of the form Aₖ → Aₗγ, ℓ ≥ k. The productions with ℓ = k are then replaced according to Lemma 4.4, introducing a new variable Bₖ. The precise algorithm is given in Fig. 4.9.
begin
 1)   for k := 1 to m do
      begin
 2)     for j := 1 to k − 1 do
 3)       for each production of the form Aₖ → Aⱼα do
          begin
 4)         for all productions Aⱼ → β do
 5)           add production Aₖ → βα;
 6)         remove production Aₖ → Aⱼα
          end;
 7)     for each production of the form Aₖ → Aₖα do
        begin
 8)       add productions Bₖ → α and Bₖ → αBₖ;
 9)       remove production Aₖ → Aₖα
        end;
10)     for each production Aₖ → β, where β does not begin with Aₖ, do
          add production Aₖ → βBₖ
      end
end

Fig. 4.9 Step 1 in the Greibach normal-form algorithm.

By repeating the above process for each original variable, we have only productions of the forms:

1) Aᵢ → Aⱼγ,  j > i;
2) Aᵢ → aγ,   a in T;
3) Bᵢ → γ,    γ in (V ∪ {B₁, B₂, ..., Bᵢ₋₁})*.
Note that the leftmost symbol on the right-hand side of any production for Aₘ must be a terminal, since Aₘ is the highest-numbered variable. The leftmost symbol on the right-hand side of any production for Aₘ₋₁ must be either Aₘ or a terminal symbol. When it is Aₘ, we can generate new productions by replacing Aₘ by the right-hand side of the productions for Aₘ according to Lemma 4.3. These productions must have right sides that start with a terminal symbol. We then proceed to the productions for Aₘ₋₂, ..., A₂, A₁ until the right side of each production for an Aᵢ starts with a terminal symbol.
As the last step we examine the productions for the new variables, B₁, B₂, ..., Bₘ. Since we began with a grammar in Chomsky normal form, it is easy to prove by induction on the number of applications of Lemmas 4.3 and 4.4 that the right-hand side of every Aᵢ-production, for 1 ≤ i ≤ m, begins with a terminal or AⱼAₖ for some j and k. Thus α in line (7) of Fig. 4.9 can never be empty or begin with some Bⱼ, so no Bᵢ-production can start with another Bⱼ. Therefore all Bᵢ-productions have right-hand sides beginning with terminals or Aᵢ's, and one more application of Lemma 4.3 for each Bᵢ-production completes the construction.
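Fig. 4.9, followed by the back-substitutions just described, can be sketched concretely. The encoding below (bodies as tuples, variables assumed to be named A1, ..., Am so that Bk can be derived from Ak's name) is our own illustration; running it on the grammar of Example 4.10 that follows reproduces that example's production counts:

```python
def to_gnf(productions, order):
    """Theorem 4.6 sketch: step 1 as in Fig. 4.9, then back-substitution
    (Lemma 4.3) so every right side starts with a terminal."""
    P = {v: list(bs) for v, bs in productions.items()}

    def expand(var):
        # Lemma 4.3: replace a leading variable by all of its right sides.
        out = []
        for body in P[var]:
            if body[0] in P and body[0] != var:
                out.extend(beta + body[1:] for beta in P[body[0]])
            else:
                out.append(body)
        P[var] = out

    b_vars = []
    for k, Ak in enumerate(order):
        for Aj in order[:k]:                       # lines (2)-(6) of Fig. 4.9
            updated = []
            for body in P[Ak]:
                if body[0] == Aj:
                    updated.extend(beta + body[1:] for beta in P[Aj])
                else:
                    updated.append(body)
            P[Ak] = updated
        alphas = [b[1:] for b in P[Ak] if b[0] == Ak]
        if alphas:                                 # lines (7)-(9): Lemma 4.4
            Bk = "B" + Ak[1:]
            b_vars.append(Bk)
            P[Bk] = alphas + [a + (Bk,) for a in alphas]
            rest = [b for b in P[Ak] if b[0] != Ak]
            P[Ak] = rest + [b + (Bk,) for b in rest]   # line (10)
    for Ak in reversed(order):     # make every A-right side terminal-initial
        expand(Ak)
    for Bk in b_vars:              # then every B-right side
        expand(Bk)
    return P

P = {"A1": [("A2", "A3")],
     "A2": [("A3", "A1"), ("b",)],
     "A3": [("A1", "A2"), ("a",)]}
G = to_gnf(P, ["A1", "A2", "A3"])
```

On this grammar the result has 4 productions for A3, 5 each for A2 and A1, and 10 for B3, matching the worked example below.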
Example 4.10 Let us convert to Greibach normal form the grammar G = ({A₁, A₂, A₃}, {a, b}, P, A₁), where P consists of the following:

A₁ → A₂A₃
A₂ → A₃A₁ | b
A₃ → A₁A₂ | a

Step 1 Since the right-hand sides of the productions for A₁ and A₂ start with terminals or higher-numbered variables, we begin with the production A₃ → A₁A₂. The only production with A₁ on the left is A₁ → A₂A₃, so we substitute the string A₂A₃ for A₁. The resulting set of productions is:

A₁ → A₂A₃
A₂ → A₃A₁ | b
A₃ → A₂A₃A₂ | a

Since the right side of the production A₃ → A₂A₃A₂ begins with a lower-numbered variable, we substitute for the first occurrence of A₂ both A₃A₁ and b. Thus A₃ → A₂A₃A₂ is replaced by A₃ → A₃A₁A₃A₂ and A₃ → bA₃A₂. The new set is

A₁ → A₂A₃
A₂ → A₃A₁ | b
A₃ → A₃A₁A₃A₂ | bA₃A₂ | a

We now apply Lemma 4.4 to the productions A₃ → A₃A₁A₃A₂ | bA₃A₂ | a. Symbol B₃ is introduced, and the production A₃ → A₃A₁A₃A₂ is replaced by A₃ → bA₃A₂B₃, A₃ → aB₃, B₃ → A₁A₃A₂, and B₃ → A₁A₃A₂B₃. The resulting set is

A₁ → A₂A₃
A₂ → A₃A₁ | b
A₃ → bA₃A₂B₃ | aB₃ | bA₃A₂ | a
B₃ → A₁A₃A₂ | A₁A₃A₂B₃
Step 2 Now all the productions with A₃ on the left have right-hand sides that start with terminals. These are used to replace A₃ in the production A₂ → A₃A₁, and then the productions with A₂ on the left are used to replace A₂ in the production A₁ → A₂A₃. The result is the following.

A₃ → bA₃A₂B₃      A₃ → bA₃A₂      A₃ → aB₃      A₃ → a
A₂ → bA₃A₂B₃A₁    A₂ → bA₃A₂A₁    A₂ → aB₃A₁    A₂ → aA₁    A₂ → b
A₁ → bA₃A₂B₃A₁A₃  A₁ → bA₃A₂A₁A₃  A₁ → aB₃A₁A₃  A₁ → aA₁A₃  A₁ → bA₃
B₃ → A₁A₃A₂       B₃ → A₁A₃A₂B₃

Step 3 The two B₃-productions

B₃ → A₁A₃A₂   and   B₃ → A₁A₃A₂B₃

are converted to proper form, resulting in 10 more productions. That is, the productions are altered by substituting the right side of each of the five productions with A₁ on the left for the first occurrences of A₁. Thus B₃ → A₁A₃A₂ becomes

B₃ → bA₃A₂B₃A₁A₃A₃A₂,  B₃ → bA₃A₂A₁A₃A₃A₂,  B₃ → aB₃A₁A₃A₃A₂,  B₃ → aA₁A₃A₃A₂,  B₃ → bA₃A₃A₂.

The other production for B₃ is replaced similarly. The final set of productions is

A₃ → bA₃A₂B₃      A₃ → bA₃A₂      A₃ → aB₃      A₃ → a
A₂ → bA₃A₂B₃A₁    A₂ → bA₃A₂A₁    A₂ → aB₃A₁    A₂ → aA₁    A₂ → b
A₁ → bA₃A₂B₃A₁A₃  A₁ → bA₃A₂A₁A₃  A₁ → aB₃A₁A₃  A₁ → aA₁A₃  A₁ → bA₃
B₃ → bA₃A₂B₃A₁A₃A₃A₂B₃    B₃ → bA₃A₂B₃A₁A₃A₃A₂
B₃ → bA₃A₂A₁A₃A₃A₂B₃      B₃ → bA₃A₂A₁A₃A₃A₂
B₃ → aB₃A₁A₃A₃A₂B₃        B₃ → aB₃A₁A₃A₃A₂
B₃ → aA₁A₃A₃A₂B₃          B₃ → aA₁A₃A₃A₂
B₃ → bA₃A₃A₂B₃            B₃ → bA₃A₃A₂
4.7 THE EXISTENCE OF INHERENTLY AMBIGUOUS CONTEXT-FREE LANGUAGES

It is easy to exhibit ambiguous context-free grammars. For example, consider the grammar with productions S → A, S → B, A → a, and B → a. What is not so easy to do is to exhibit a context-free language for which every CFG is ambiguous. In this section we show that there are indeed inherently ambiguous CFL's. The proof is somewhat tedious, and the student may skip this section without loss of continuity. The existence of such a language is made use of only in Theorem 8.16. We shall show that the language

L = {aⁿbⁿcᵐdᵐ | n ≥ 1, m ≥ 1} ∪ {aⁿbᵐcᵐdⁿ | n ≥ 1, m ≥ 1}

is inherently ambiguous by showing that infinitely many strings of the form aⁿbⁿcⁿdⁿ, n ≥ 1, must have two distinct leftmost derivations. We proceed by first establishing two technical lemmas.
Lemma 4.5 Let (Nᵢ, Mᵢ), 1 ≤ i ≤ r, be pairs of sets of integers. (The sets may be finite or infinite.) Let Sᵢ = {(n, m) | n in Nᵢ and m in Mᵢ}, and let S = S₁ ∪ S₂ ∪ ··· ∪ Sᵣ. If S contains each pair (n, m) for all n and m, where n ≠ m, then (n, n) is in S for all but some finite set of n.
Proof Assume that for all n and m, where n ≠ m, (n, m) is in S, and that there are infinitely many n such that (n, n) is not in S. Let J be the set of all n such that (n, n) is not in S. We construct a sequence of sets Jᵣ, Jᵣ₋₁, ..., J₁ such that J ⊇ Jᵣ ⊇ Jᵣ₋₁ ⊇ ··· ⊇ J₁. Each Jᵢ will be infinite, and for each n and m in Jᵢ, (n, m) is not in Sᵢ ∪ Sᵢ₊₁ ∪ ··· ∪ Sᵣ.

For n in J, either n is not in Nᵣ or n is not in Mᵣ; otherwise (n, n) would be in Sᵣ and hence in S. Thus there is an infinite subset of J, call it Jᵣ, such that either for all n in Jᵣ, n is not in Nᵣ, or for all n in Jᵣ, n is not in Mᵣ. Now for n and m in Jᵣ, (n, m) is not in Sᵣ.
Assume that Jᵣ, Jᵣ₋₁, ..., Jᵢ₊₁ have been constructed, where i ≥ 1. Then Jᵢ is constructed as follows. For each n in Jᵢ₊₁, either n is not in Nᵢ or n is not in Mᵢ; otherwise (n, n) would be in Sᵢ and hence in S, a contradiction since Jᵢ₊₁ ⊆ J. Thus, either an infinite subset of Jᵢ₊₁ is not in Nᵢ or an infinite subset of Jᵢ₊₁ is not in Mᵢ. In either case, let the infinite subset be Jᵢ. Now for all n and m in Jᵢ, (n, m) is not in Sᵢ, and hence not in Sᵢ ∪ Sᵢ₊₁ ∪ ··· ∪ Sᵣ.

Since J₁ contains an infinite number of elements, there exist n and m in J₁, n ≠ m. Now (n, m) is not in S₁ ∪ S₂ ∪ ··· ∪ Sᵣ = S, contradicting the assumption that all (n, m), where n ≠ m, are in S. Thus (n, n) is in S for all but some finite set of n.
Lemma 4.6 Let G be an unambiguous CFG. Then we can effectively construct an unambiguous CFG G′ equivalent to G, such that G′ has no useless symbols or productions, no ε-productions, and no unit productions, and such that for every variable A other than possibly the start symbol of G′, we have a derivation A ⇒* x₁Ax₂, where x₁ and x₂ are not both ε.

Proof The constructions of Lemmas 4.1 and 4.2, removing useless symbols and productions, cannot convert an unambiguous grammar into an ambiguous one, since the set of derivation trees for words does not change. The construction of Theorem 4.4, removing unit productions, cannot introduce ambiguities. This is because if we introduce production A → α, there must be a unique B such that A ⇒* B and B → α is a production, else the original grammar was not unambiguous. Similarly, the construction of Theorem 4.3, removing ε-productions, does not introduce ambiguity.

Let us therefore assume that G has no useless symbols or productions, no ε-productions, and no unit productions. Suppose that for no x₁ and x₂, not both ε, does A ⇒* x₁Ax₂. Then replace each occurrence of A on the right side of any production by all the right sides of A-productions. As there are no unit productions, ε-productions, or useless symbols, there cannot be a production A → α₁Aα₂, else there is a derivation A ⇒* x₁Ax₂, with x₁ and x₂ not both ε. The above change does not modify the generated language, by Lemma 4.3. Each new production comes from a unique sequence of old productions, else G was ambiguous. Thus the resulting grammar is unambiguous. We see that A is now useless and may be eliminated. After removing all variables violating the condition of the lemma in this manner, the new grammar is equivalent to the old, is still unambiguous, and satisfies the lemma.

Theorem 4.7 The CFL

L = {aⁿbⁿcᵐdᵐ | n ≥ 1, m ≥ 1} ∪ {aⁿbᵐcᵐdⁿ | n ≥ 1, m ≥ 1}

is inherently ambiguous.

Proof Assume that there is an unambiguous grammar generating L. By Lemma 4.6 we can construct an unambiguous grammar G = (V, T, P, S) generating L with no useless symbols, and such that for each A in V − {S}, A ⇒* x₁Ax₂ for some x₁ and x₂ in T*, not both ε.
We note that the grammar G has the following properties:

1) If A ⇒* x₁Ax₂, then x₁ and x₂ each consist of only one type of symbol (a, b, c, or d); otherwise, for some w₁, w₂, and w₃,

S ⇒* w₁Aw₃ ⇒* w₁x₁Ax₂w₃ ⇒* w₁x₁x₁Ax₂x₂w₃ ⇒* w₁x₁x₁w₂x₂x₂w₃,

and the last terminal string would not be in L.

2) If A ⇒* x₁Ax₂, then x₁ and x₂ consist of different symbols. Otherwise, by repeating the derivation A ⇒* x₁Ax₂ in a derivation of a sentence of L, we could increase the number of one type of symbol without increasing the number of any other, generating a terminal string having more of one symbol than any other.

3) If A ⇒* x₁Ax₂, then |x₁| = |x₂|. Otherwise we could again derive a sentence having unequal numbers of two types of symbol that must match in L.

4) If A ⇒* x₁Ax₂ and A ⇒* x₃Ax₄, then x₁ and x₃ consist of the same type of symbol. Likewise x₂ and x₄. Otherwise Property 1 above would be violated.

5) If A ⇒* x₁Ax₂, then either
   a) x₁ consists solely of a's and x₂ solely of b's or of d's,
   b) x₁ consists solely of b's and x₂ solely of c's, or
   c) x₁ consists solely of c's and x₂ solely of d's.
   In any of the other cases it is easy to derive a string not in L.

6) Thus the variables other than S can be divided into four classes, C_ab, C_ad, C_bc, and C_cd. C_ab is the set of all A in V such that A ⇒* x₁Ax₂, with x₁ in a* and x₂ in b*; C_ad, C_bc, and C_cd are defined analogously.

A derivation containing a symbol in C_ab or C_cd cannot contain a symbol in C_ad or C_bc, or vice versa. Otherwise, we could increase the number of three types of symbols of a sentence in L without increasing the number of the fourth type of symbol. In that case, there would be a sentence in L for which the number of occurrences of one type of symbol is smaller than that of any other.
We now note that if a derivation contains a variable in C_ab or C_cd, then the terminal string generated must be in {aⁿbⁿcᵐdᵐ | n ≥ 1, m ≥ 1}. For assume that A in C_ab appears in a derivation of a sentence x not in {aⁿbⁿcᵐdᵐ | n ≥ 1, m ≥ 1}. Then x must be of the form aⁿbᵐcᵐdⁿ, m ≠ n. Since A is in C_ab, a sentence aⁿ⁺ᵖbᵐ⁺ᵖcᵐdⁿ, for some p > 0, could be generated. Such a sentence is not in L. A similar argument holds for a variable in C_cd. Similar reasoning implies that if a derivation contains a variable in C_ad or C_bc, then the sentence generated must be in {aⁿbᵐcᵐdⁿ | n ≥ 1, m ≥ 1}.

We divide G into two grammars,

G₁ = ({S} ∪ C_ab ∪ C_cd, T, P₁, S)   and   G₂ = ({S} ∪ C_ad ∪ C_bc, T, P₂, S),
where P₁ contains all productions of P with a variable from C_ab or C_cd on either the right or left, and P₂ contains all productions of P with a variable from C_ad or C_bc on either the right or left. In addition, P₁ contains all productions from P of the form S → aⁿbⁿcᵐdᵐ, n ≠ m, and P₂ contains all productions from P of the form S → aⁿbᵐcᵐdⁿ, n ≠ m. Productions of P of the form S → aⁿbⁿcⁿdⁿ are not in either P₁ or P₂.

Since G generates {aⁿbⁿcᵐdᵐ | n ≥ 1, m ≥ 1} ∪ {aⁿbᵐcᵐdⁿ | n ≥ 1, m ≥ 1}, G₁ must generate all sentences in

{aⁿbⁿcᵐdᵐ | n ≥ 1, m ≥ 1, n ≠ m},

plus possibly some sentences in {aⁿbⁿcⁿdⁿ | n ≥ 1}, and G₂ must generate all sentences in

{aⁿbᵐcᵐdⁿ | n ≥ 1, m ≥ 1, n ≠ m},

plus possibly some sentences in {aⁿbⁿcⁿdⁿ | n ≥ 1}.
We now show that this cannot be the case unless G₁ and G₂ both generate all but a finite number of sentences in {aⁿbⁿcⁿdⁿ | n ≥ 1}. Thus all but a finite number of sentences in {aⁿbⁿcⁿdⁿ | n ≥ 1} are generated by both G₁ and G₂ and hence by two distinct derivations in G. This contradicts the assumption that G was unambiguous.

To see that G₁ and G₂ generate all but a finite number of sentences in {aⁿbⁿcⁿdⁿ | n ≥ 1}, number the productions in P₁ of the form S → α from 1 to r. For 1 ≤ i ≤ r, if S → α is the ith production, let Nᵢ be the set of all n such that

S ⇒ α ⇒* aⁿbⁿcᵐdᵐ in G₁,

for some m, and let Mᵢ be the set of all m such that

S ⇒ α ⇒* aⁿbⁿcᵐdᵐ in G₁,

for some n. We leave it to the reader to show that for any n in Nᵢ and any m in Mᵢ, S ⇒* aⁿbⁿcᵐdᵐ in G₁. [Hint: Recall that the variables of α are in C_ab or C_cd.] It follows immediately from Lemma 4.5 that G₁ must generate all but a finite number of sentences in {aⁿbⁿcⁿdⁿ | n ≥ 1}.
A similar argument applies to G₂. The reader can easily show that G₂ cannot have a right side with two or more variables. We number certain productions and pairs of productions in a single ordering: productions of the form S → α are numbered as before, and the pair of productions S → α and A → α₁Bα₂ will receive a number if α contains the variable A, A is in C_ad, and B is in C_bc. If this pair is assigned the number i, then define Nᵢ to be the set of all n such that for some m,

S ⇒ α ⇒* x₁Ax₂ ⇒ x₁α₁Bα₂x₂ ⇒* aⁿbᵐcᵐdⁿ.

Also define Mᵢ to be the set of all m such that for some n,

S ⇒ α ⇒* x₁Ax₂ ⇒ x₁α₁Bα₂x₂ ⇒* aⁿbᵐcᵐdⁿ.

Once again, for any n in Nᵢ and m in Mᵢ, S ⇒* aⁿbᵐcᵐdⁿ, and thus it follows from Lemma 4.5 that G₂ generates all but a finite number of sentences in {aⁿbⁿcⁿdⁿ | n ≥ 1}. We conclude that for some n, aⁿbⁿcⁿdⁿ is in both L(G₁) and L(G₂). This sentence has two leftmost derivations in G.
EXERCISES

4.1 Give context-free grammars generating the following sets.
 a) The set of palindromes (strings that read the same forward as backward) over alphabet {a, b}.
 b) The set of all strings of balanced parentheses, i.e., each left parenthesis has a matching right parenthesis and pairs of matching parentheses are properly nested.
*c) The set of all strings over alphabet {a, b} with exactly twice as many a's as b's.
 d) The set of all strings over alphabet {a, b, +, ·, *, (, ), e, 0} that are well-formed regular expressions over alphabet {a, b}. Note that we must distinguish between ε as the empty string and e as the symbol for it in a regular expression. We use e in the latter case.
*e) The set of all strings over alphabet {a, b} not of the form ww for some string w.
 f) {aⁱbʲ | i ≠ j and i ≠ 2j}.

4.2 Let G be the grammar

S → aS | aSbS | ε.

Prove that L(G) = {x | each prefix of x has at least as many a's as b's}.

4.3 For i ≥ 1, let bᵢ denote the string in 1(0 + 1)* that is the binary representation of i. Construct a CFG generating {0, 1, #}⁺ − {b₁#b₂# ··· #bₙ | n ≥ 1}.

4.4 Construct a CFG generating {w#wᴿ# | w in (0 + 1)⁺}*.
4.5 The grammar

E → E + E | E * E | (E) | id

generates the set of arithmetic expressions with +, *, parentheses, and id. The grammar is ambiguous since id + id * id can be generated by two distinct leftmost derivations.
 a) Construct an equivalent unambiguous grammar.
 b) Construct an unambiguous grammar for all arithmetic expressions with no redundant parentheses. A set of parentheses is redundant if its removal does not change the expression, e.g., the parentheses are redundant in id + (id * id) but not in (id + id) * id.

* 4.6 Suppose G is a CFG with m variables and no right side of a production longer than ℓ. Show that if A ⇒* ε, then there is a derivation of no more than (ℓᵐ − 1)/(ℓ − 1) steps by which A derives ε. How close to this bound can you actually come?

* 4.7 Show that for each CFG G there is a constant c such that if w is in L(G), and w ≠ ε, then w has a derivation of no more than c|w| steps.

4.8 Let G be the grammar

S → aB | bA
A → a | aS | bAA
B → b | bS | aBB

For the string aaabbabbba, find a) a leftmost derivation, b) a rightmost derivation, c) a parse tree.

* 4.9 Is the grammar of Exercise 4.8 unambiguous?

4.10 Find a CFG with no useless symbols equivalent to

S → AB | CA     B → BC | AB
A → a           C → aB | b

4.11 Suppose G is a CFG and w, of length ℓ, is in L(G). How long is a derivation of w in G if a) G is in CNF? b) G is in GNF?

4.12 Let G be the CFG generating well-formed formulas of propositional calculus with predicates p and q:

S → ~S | [S ⊃ S] | p | q.

The terminals are p, q, ~, [, ], and ⊃. Find a Chomsky normal-form grammar generating L(G).

4.13 Show that conversion to Chomsky normal form can square the number of productions in a grammar. [Hint: Consider the removal of unit productions.]

4.14 Find a Greibach normal-form grammar equivalent to the following CFG:

S → AA | 0
A → SS | 1

4.15 Show that every CFL without ε can be generated by a CFG all of whose productions are of the form A → a or A → BC, where B ≠ C, and if A → α₁Bα₂ and A → γ₁Bγ₂ are productions, then α₁ = γ₁ = ε or α₂ = γ₂ = ε.

*S 4.16 Show that every CFL without ε can be generated by a CFG all of whose productions are of the form A → a, A → aB, or A → aBC.
* 4.17 Show that every CFL without ε can be generated by a CFG all of whose productions are of the forms A → a and A → aαb.

4.18 Can every CFL without ε be generated by a CFG all of whose productions are of the forms A → BCD and A → a?

4.19 Show that if all productions of a CFG are of the form A → wB or A → w, then L(G) is a regular set.

** 4.20 A CFG is said to be linear if no right side of a production has more than one instance of a variable. Which of the languages of Exercise 4.1 have linear grammars?

**S 4.21 An operator grammar is a CFG with no ε-productions such that no consecutive symbols on the right sides of productions are variables. Show that every CFL without ε has an operator grammar.

** 4.22 The algorithm given in Fig. 4.7 to determine which variables derive terminal strings is not the most efficient possible. Give a computer program to perform the task in O(n) steps, if n is the sum of the lengths of all the productions.

** 4.23 Is {aⁱbʲcᵏ | i ≠ j and j ≠ k and k ≠ i} a CFL? [Hint: Develop a normal form similar to that in Theorem 4.7. (A pumping lemma is developed in Section 6.1 that makes exercises of this type much easier. The reader may wish to compare his solution to that in Example 6.3.)]
Solutions to Selected Exercises

4.1 a) The definition of "palindrome," a string reading the same forward as backward, is no help in finding a CFG. What we must do in this and many other cases is rework the definition into a recursive form. We may define palindromes over {0, 1} recursively, as follows:

1) ε, 0, and 1 are palindromes;
2) if w is a palindrome, so are 0w0 and 1w1;
3) nothing else is a palindrome.

We proved in Exercise 1.3 that this is a valid definition of palindromes. A CFG now follows immediately from (1) and (2). It is:

S → 0 | 1 | ε     (from 1);
S → 0S0 | 1S1     (from 2).

4.16 Let G = (V, T, P, S) be a GNF grammar generating L. Suppose k is the length of the longest right side of a production of G. Let V′ = {[α] | α in V⁺ and |α| ≤ k}. For each production A → aα in P and each variable [Aβ] in V′, place [Aβ] → a[α][β] in P′. In the case where α or β is ε, [ε] is deleted from the right side of the production.

4.21 Let G = (V, T, P, S) be a grammar generating L. By Exercise 4.16 we may assume all productions are of the form A → a, A → aB, and A → aBC. First replace each production of the form A → aBC by A → a[BC], where [BC] is a new variable. After having replaced all productions of the form A → aBC, then for each newly introduced variable [BC], each B-production B → α, and each C-production C → β, add production [BC] → αβ. Note that α and β are either single terminals or of the form bE, where E may be either a new or old variable. The resulting grammar is an operator grammar equivalent to the original.
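As a quick sanity check on rules (1) and (2) of the palindrome definition in Solution 4.1, a few levels of the recursion can be enumerated mechanically (an illustration of the solution's reasoning, not part of the book's text):

```python
def derive(depth):
    """All strings obtainable from rules (1) and (2) of Solution 4.1
    with at most `depth` applications of rule (2)."""
    strings = {"", "0", "1"}                       # rule (1)
    for _ in range(depth):                         # rule (2), repeatedly
        strings |= ({"0" + w + "0" for w in strings}
                    | {"1" + w + "1" for w in strings})
    return strings

out = derive(3)
# every generated string reads the same forward as backward
assert all(w == w[::-1] for w in out)
```

Nonpalindromes such as 01 are never produced, while nested forms such as 10101 appear once the depth allows them.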
BIBLIOGRAPHIC NOTES

The origin of the context-free grammar formalism is found in Chomsky [1956]; important later writings by Chomsky on the subject appear in Chomsky [1959, 1963]. The related Backus-Naur form notation was used for the description of ALGOL in Backus [1959] and Naur [1960]. The relationship between CFG's and BNF was perceived in Ginsburg and Rice [1962].

Chomsky normal form is based on Chomsky [1959]. Actually, Chomsky proved the stronger result stated in Exercise 4.15. Greibach normal form was proved by Greibach [1965]. The method of proof used here is due to M. C. Paull. The reader should also consider the algorithm of Rosenkrantz [1967], which has the property that it never more than squares the number of variables, while the algorithm of Theorem 4.6 may exponentiate the number. Solutions to Exercises 4.16, 4.17, and 4.21 can be found there as well.

Ambiguity in CFG's was first studied formally by Floyd [1962a], Cantor [1962], Chomsky and Schutzenberger [1963], and Greibach [1963]. Inherent ambiguity was studied by Gross [1964], and Ginsburg and Ullian [1966a, b].

Important applications of context-free grammar theory have been made to compiler design. See Aho and Ullman [1972, 1973, 1977], Lewis, Rosenkrantz, and Stearns [1976], and the bibliographic notes to Chapter 10 for a description of some of the work in this area. Additional material on context-free languages can be found in Ginsburg [1966] and Salomaa [1973].
CHAPTER 5

PUSHDOWN AUTOMATA

5.1 INFORMAL DESCRIPTION

Just as the regular expressions have an equivalent automaton, the finite automaton, the context-free grammars have their machine counterpart, the pushdown automaton. Here the equivalence is somewhat less satisfactory, since the pushdown automaton is a nondeterministic device, and the deterministic version accepts only a subset of all CFL's. Fortunately, this subset includes the syntax of
most programming languages. (See Chapter 10 for a detailed study of deterministic pushdown automaton languages.) The pushdown automaton is essentially a finite automaton with control of both an input tape and a stack, or "first in-last out" list. That is, symbols may be entered or removed only at the top of the list. When a symbol is entered at the top, the symbol previously at the top becomes second from the top, the symbol previously second from the top becomes third, and so on. Similarly, when a symbol is removed from the top of the list, the symbol previously second from the top becomes the top symbol, the symbol previously third from the top becomes second, and so on. A familiar example of a stack is the stack of plates on a spring that one sees in cafeterias. There is a spring below the plates with just enough strength so that exactly one plate appears above the level of the counter. When that top plate is removed, the load on the spring is lightened, and the plate directly below appears above the level of the counter. If a plate is then put on top of the stack, the pile is pushed down, and only the new plate appears above the counter. For our purposes, we make the assumption that the spring is arbitrarily long, so we may add as
many
plates as
we
desire.
Such a stack of plates, coupled with a finite control, can be used to recognize a nonregular set. The set L = {wcw^R | w in (0 + 1)*} is a context-free language, generated by the grammar S → 0S0 | 1S1 | c. It is not hard to show that L cannot be accepted by any finite automaton. To accept L, we shall make use of a finite control with two states, q1 and q2, and a stack on which we place blue, green, and red plates. The device will operate by the following rules.
1) The machine starts with one red plate on the stack and with the finite control in state q1.

2) If the input to the device is 0 and the device is in state q1, a blue plate is placed on the stack. If the input to the device is 1 and the device is in state q1, a green plate is placed on the stack. In both cases the finite control remains in state q1.

3) If the input is c and the device is in state q1, it changes state to q2 while no plates are added or removed.

4) If the input is 0 and the device is in state q2 with a blue plate, which represents 0, on top of the stack, the plate is removed. If the input is 1 and the device is in state q2 with a green plate, which represents 1, on top of the stack, the plate is removed. In both cases the finite control remains in state q2.

5) If the device is in state q2 and a red plate is on top of the stack, the plate is removed without waiting for the next input.

6) For all cases other than those described above, the device can make no move.

The preceding rules are summarized in Fig. 5.1.
We say that the device described above accepts an input string if, on processing the last symbol of the string, the stack of plates becomes completely empty. Note that, once the stack is empty, no further moves are possible.

Essentially, the device operates in the following way. In state q1, the device makes an image of its input by placing a blue plate on top of the stack of plates each time a 0 appears in the input, and a green plate each time a 1 appears in the input. When c is the input, the device enters state q2. Next, the remaining input is compared with the stack by removing a blue plate from the top of the stack each time the input symbol is a 0, and a green plate each time the input symbol is a 1. Should the top plate be of the wrong color, the device halts and no further processing of the input is possible. If all plates match the inputs, eventually the red plate at the bottom of the stack is exposed. The red plate is immediately removed, and the device is said to accept the input string. All plates can be removed only if the string that enters the device after the c is the reverse of what entered before the c.
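The six rules above translate line-for-line into a short simulation. The following Python sketch (the function name and the single-character plate encoding are ours, not the text's) uses a list as the stack of plates:

```python
def accepts_plate_machine(s: str) -> bool:
    """Simulate the two-state plate device for {w c w^R | w in (0+1)*}.

    Stack entries are plate colors: 'R' (red), 'B' (blue), 'G' (green).
    """
    stack = ["R"]            # rule 1: start with one red plate
    state = "q1"
    for ch in s:
        if state == "q1":
            if ch == "0":
                stack.append("B")    # rule 2: blue plate for a 0
            elif ch == "1":
                stack.append("G")    # rule 2: green plate for a 1
            elif ch == "c":
                state = "q2"         # rule 3: begin the matching phase
            else:
                return False
        else:  # state q2
            if ch == "0" and stack and stack[-1] == "B":
                stack.pop()          # rule 4: a matching 0 removes blue
            elif ch == "1" and stack and stack[-1] == "G":
                stack.pop()          # rule 4: a matching 1 removes green
            else:
                return False         # rule 6: no move is possible
    # rule 5: a red plate on top is removed without waiting for input
    if state == "q2" and stack == ["R"]:
        stack.pop()
    return state == "q2" and not stack   # accept iff the stack emptied

print(accepts_plate_machine("01c10"))   # True
print(accepts_plate_machine("01c01"))   # False
```

Because this particular device never has a choice of moves, a single deterministic pass suffices; the general PDA defined in the next section will require a search over move sequences.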
5.2 DEFINITIONS

We shall now formalize the concept of a pushdown automaton (PDA). The PDA will have an input tape, a finite control, and a stack. The stack is a string of symbols from some alphabet. The leftmost symbol of the stack is considered to be at the "top" of the stack.
Top plate   State   Input 0                        Input 1                         Input c
---------   -----   ----------------------------   -----------------------------   --------------
Blue        q1      Add blue plate; stay in q1.    Add green plate; stay in q1.    Go to state q2.
Blue        q2      Remove top plate; stay in q2.  (no move)                       (no move)
Green       q1      Add blue plate; stay in q1.    Add green plate; stay in q1.    Go to state q2.
Green       q2      (no move)                      Remove top plate; stay in q2.   (no move)
Red         q1      Add blue plate; stay in q1.    Add green plate; stay in q1.    Go to state q2.
Red         q2      Without waiting for next input, remove top plate.

Fig. 5.1 Finite control for pushdown automaton accepting {wcw^R | w in (0 + 1)*}.
The device will be nondeterministic, having some finite number of choices of moves in each situation. The moves will be of two types. In the first type of move, an input symbol is used. Depending on the input symbol, the top symbol on the stack, and the state of the finite control, a number of choices are possible. Each choice consists of a next state for the finite control and a (possibly empty) string of symbols to replace the top stack symbol. After selecting a choice, the input head is advanced one symbol. The second type of move (called an ε-move) is similar to the first, except that the input symbol is not used, and the input head is not advanced after the move. This type of move allows the PDA to manipulate the stack without reading input symbols.

Finally, we must define the language accepted by a pushdown automaton. There are two natural ways to do this. The first, which we have already seen, is to define the language accepted to be the set of all inputs for which some sequence of moves causes the pushdown automaton to empty its stack. This language is referred to as the language accepted by empty stack. The second way of defining the language accepted is similar to the way a finite automaton accepts inputs. That is, we designate some states as final states and define the accepted language as the set of all inputs for which some choice of moves causes the pushdown automaton to enter a final state.

As we shall see, the two definitions of acceptance are equivalent in the sense that if a set can be accepted by empty stack by some PDA, it can be accepted by final state by some other PDA, and vice versa. Acceptance by final state is the more common notion, but it is easier to prove the basic theorem of pushdown automata by using acceptance by empty stack. This theorem is that a language is accepted by a pushdown automaton if and only if it is a context-free language.
A pushdown automaton M is a system (Q, Σ, Γ, δ, q0, Z0, F), where

1) Q is a finite set of states;
2) Σ is an alphabet called the input alphabet;
3) Γ is an alphabet, called the stack alphabet;
4) q0 in Q is the initial state;
5) Z0 in Γ is a particular stack symbol called the start symbol;
6) F ⊆ Q is the set of final states;
7) δ is a mapping from Q × (Σ ∪ {ε}) × Γ to finite subsets of Q × Γ*.

Unless stated otherwise, we use lower-case letters near the front of the alphabet to denote input symbols and lower-case letters near the end of the alphabet to denote strings of input symbols. Capital letters denote stack symbols, and Greek letters indicate strings of stack symbols.
Moves

The interpretation of

δ(q, a, Z) = {(p1, γ1), (p2, γ2), ..., (pm, γm)},

where q and pi, 1 ≤ i ≤ m, are states, a is in Σ, Z is a stack symbol, and γi is in Γ*, is that the PDA in state q, with input symbol a and Z the top symbol on the stack, can, for any i, enter state pi, replace symbol Z by string γi, and advance the input head one symbol. We adopt the convention that the leftmost symbol of γi will be placed highest on the stack and the rightmost symbol lowest on the stack. Note that it is not permissible to choose state pi and string γj for some j ≠ i in one move.

The interpretation of

δ(q, ε, Z) = {(p1, γ1), (p2, γ2), ..., (pm, γm)}

is that the PDA in state q, independent of the input symbol being scanned and with Z the top symbol on the stack, can enter state pi and replace Z by γi, for any i, 1 ≤ i ≤ m. In this case, the input head is not advanced.
Example 5.1
Figure 5.2 gives a formal pushdown automaton that accepts {wcw^R | w in (0 + 1)*} by empty stack. Note that for a move in which the PDA writes a symbol on the top of the stack, δ has a value (q, γ) where |γ| = 2. For example, δ(q1, 0, R) = {(q1, BR)}. If γ were of length one, the PDA would simply replace the top symbol by a new symbol and not increase the length of the stack. This allows us to let γ equal ε when we wish to pop the stack. Note that the rule δ(q2, ε, R) = {(q2, ε)} means that the PDA, in state q2 with R the top stack symbol, can erase the R independently of the input symbol. In this case, the input head is not advanced, and in fact, there need not be any remaining input.
M = ({q1, q2}, {0, 1, c}, {R, B, G}, δ, q1, R, ∅)

δ(q1, 0, R) = {(q1, BR)}        δ(q1, c, R) = {(q2, R)}
δ(q1, 1, R) = {(q1, GR)}        δ(q1, c, B) = {(q2, B)}
δ(q1, 0, B) = {(q1, BB)}        δ(q1, c, G) = {(q2, G)}
δ(q1, 1, B) = {(q1, GB)}        δ(q2, 0, B) = {(q2, ε)}
δ(q1, 0, G) = {(q1, BG)}        δ(q2, 1, G) = {(q2, ε)}
δ(q1, 1, G) = {(q1, GG)}        δ(q2, ε, R) = {(q2, ε)}

Fig. 5.2 Formal pushdown automaton accepting {wcw^R | w in (0 + 1)*} by empty stack.
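A PDA of this kind is easy to execute mechanically. The sketch below (our own encoding, not the text's: '' stands for ε, and the leftmost character of a string is the top of the stack) tabulates the δ of Fig. 5.2 and searches all move sequences for one that empties the stack:

```python
# δ maps (state, input symbol or '' for an ε-move, top stack symbol)
# to a set of (next state, string pushed in place of the top symbol).
DELTA = {
    ("q1", "0", "R"): {("q1", "BR")}, ("q1", "c", "R"): {("q2", "R")},
    ("q1", "1", "R"): {("q1", "GR")}, ("q1", "c", "B"): {("q2", "B")},
    ("q1", "0", "B"): {("q1", "BB")}, ("q1", "c", "G"): {("q2", "G")},
    ("q1", "1", "B"): {("q1", "GB")}, ("q2", "0", "B"): {("q2", "")},
    ("q1", "0", "G"): {("q1", "BG")}, ("q2", "1", "G"): {("q2", "")},
    ("q1", "1", "G"): {("q1", "GG")}, ("q2", "", "R"): {("q2", "")},
}

def accepts_by_empty_stack(delta, start_state, start_symbol, w):
    """Try every sequence of moves; accept if some sequence empties the
    stack with the whole input consumed.  Leftmost character = top."""
    seen = set()
    def search(state, rest, stack):
        if (state, rest, stack) in seen:
            return False
        seen.add((state, rest, stack))
        if not stack:
            return not rest      # empty stack: accept iff input exhausted
        moves = []
        if rest:                 # moves that consume one input symbol
            for p, push in delta.get((state, rest[0], stack[0]), ()):
                moves.append((p, rest[1:], push + stack[1:]))
        for p, push in delta.get((state, "", stack[0]), ()):   # ε-moves
            moves.append((p, rest, push + stack[1:]))
        return any(search(*m) for m in moves)
    return search(start_state, w, start_symbol)

print(accepts_by_empty_stack(DELTA, "q1", "R", "01c10"))  # True
```

The `seen` set guards against repeating a configuration; since every non-ε move consumes input and the only ε-move pops, the search always terminates.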
Instantaneous descriptions

To formally describe the configuration of a PDA at a given instant we define an instantaneous description (ID). The ID must, of course, record the state and stack contents. However, we find it useful to include the "unexpended input" as well. Thus we define an ID to be a triple (q, w, γ), where q is a state, w a string of input symbols, and γ a string of stack symbols. If M = (Q, Σ, Γ, δ, q0, Z0, F) is a PDA, we say (q, aw, Zα) ⊢_M (p, w, βα) if δ(q, a, Z) contains (p, β). Note that a may be ε or an input symbol. For example, in the PDA of Fig. 5.2, the fact that (q1, BG) is in δ(q1, 0, G) tells us that (q1, 011, GGR) ⊢ (q1, 11, BGGR). We write ⊢*_M for the reflexive and transitive closure of ⊢_M, and ⊢^i_M for a sequence of exactly i moves. The subscript M is dropped from ⊢_M and ⊢*_M whenever the particular PDA M is understood.
Accepted languages

For PDA M = (Q, Σ, Γ, δ, q0, Z0, F) we define L(M), the language accepted by final state, to be

{w | (q0, w, Z0) ⊢* (p, ε, γ) for some p in F and γ in Γ*}.

We define N(M), the language accepted by empty stack (or null stack), to be

{w | (q0, w, Z0) ⊢* (p, ε, ε) for some p in Q}.

When acceptance is by empty stack, the set of final states is irrelevant, and, in this case, we usually let the set of final states be the empty set.

Example 5.2 Figure 5.3 gives a PDA that accepts {ww^R | w in (0 + 1)*} by empty stack. Rules (1) through (6) allow M to store the input on the stack. In rules (3) and (6), M has a choice of two moves. M may decide that the middle of the input string has been reached and make the second choice: M goes to state q2 and tries to match the remaining input symbols with the contents of the stack. If M guessed right, and if the input is of the form ww^R, then the inputs will match, M will empty its stack, and thus accept the input string.
M = ({q1, q2}, {0, 1}, {R, B, G}, δ, q1, R, ∅)

1) δ(q1, 0, R) = {(q1, BR)}
2) δ(q1, 1, R) = {(q1, GR)}
3) δ(q1, 0, B) = {(q1, BB), (q2, ε)}
4) δ(q1, 0, G) = {(q1, BG)}
5) δ(q1, 1, B) = {(q1, GB)}
6) δ(q1, 1, G) = {(q1, GG), (q2, ε)}
7) δ(q2, 0, B) = {(q2, ε)}
8) δ(q2, 1, G) = {(q2, ε)}
9) δ(q1, ε, R) = {(q2, ε)}
10) δ(q2, ε, R) = {(q2, ε)}

Fig. 5.3 A nondeterministic PDA that accepts {ww^R | w in (0 + 1)*} by empty stack.
Like the nondeterministic finite automaton, a nondeterministic PDA M accepts an input if any sequence of choices causes M to empty its stack. Thus M always "guesses right," because wrong guesses, in themselves, do not cause an input to be rejected. An input is rejected only if there is no "right guess." Figure 5.4 shows the accessible ID's of M when M processes the string 001100.

(q1, 001100, R)
├─ (q2, 001100, ε)
└─ (q1, 01100, BR)
   ├─ (q2, 1100, R) ─ (q2, 1100, ε)
   └─ (q1, 1100, BBR)
      └─ (q1, 100, GBBR)
         ├─ (q2, 00, BBR) ─ (q2, 0, BR) ─ (q2, ε, R) ─ (q2, ε, ε)  Accept
         └─ (q1, 00, GGBBR)
            └─ (q1, 0, BGGBBR)
               ├─ (q1, ε, BBGGBBR)
               └─ (q2, ε, GGBBR)

Fig. 5.4 Accessible ID's for the PDA of Fig. 5.3 with input 001100.

Deterministic PDA's

The PDA of Example 5.1 is deterministic in the sense that at most one move is possible from any ID. Formally, we say that a PDA M = (Q, Σ, Γ, δ, q0, Z0, F) is deterministic if:

1) for each q in Q and Z in Γ, whenever δ(q, ε, Z) is nonempty, then δ(q, a, Z) is empty for all a in Σ;
2) for no q in Q, Z in Γ, and a in Σ ∪ {ε} does δ(q, a, Z) contain more than one element.

Condition 1 prevents the possibility of a choice between a move independent of the input symbol (ε-move) and a move involving an input symbol. Condition 2 prevents a choice of move for any (q, a, Z) or (q, ε, Z). Note that unlike the finite automaton, a PDA is assumed to be nondeterministic unless we state otherwise.

For finite automata, the deterministic and nondeterministic models were equivalent with respect to the languages accepted. The same is not true for PDA's. In fact {ww^R | w in (0 + 1)*} is accepted by a nondeterministic PDA, but not by any deterministic PDA.
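A tree of accessible ID's like that of Fig. 5.4 can be generated mechanically by a breadth-first search over ID's. A minimal Python sketch (our own encoding, not the text's: '' marks an ε-move, and the leftmost stack character is the top):

```python
from collections import deque

# δ of the PDA of Fig. 5.3, which accepts {ww^R | w in (0+1)*}.
DELTA = {
    ("q1", "0", "R"): {("q1", "BR")},
    ("q1", "1", "R"): {("q1", "GR")},
    ("q1", "0", "B"): {("q1", "BB"), ("q2", "")},
    ("q1", "0", "G"): {("q1", "BG")},
    ("q1", "1", "B"): {("q1", "GB")},
    ("q1", "1", "G"): {("q1", "GG"), ("q2", "")},
    ("q2", "0", "B"): {("q2", "")},
    ("q2", "1", "G"): {("q2", "")},
    ("q1", "", "R"): {("q2", "")},
    ("q2", "", "R"): {("q2", "")},
}

def accessible_ids(delta, start_state, w):
    """Breadth-first enumeration of every ID reachable from
    (start_state, w, R), mirroring the tree of Fig. 5.4."""
    ids, queue = [], deque([(start_state, w, "R")])
    seen = set(queue)
    while queue:
        state, rest, stack = queue.popleft()
        ids.append((state, rest, stack))
        succs = set()
        if rest and stack:       # moves consuming one input symbol
            succs |= {(p, rest[1:], y + stack[1:])
                      for p, y in delta.get((state, rest[0], stack[0]), ())}
        if stack:                # ε-moves
            succs |= {(p, rest, y + stack[1:])
                      for p, y in delta.get((state, "", stack[0]), ())}
        for s in succs - seen:
            seen.add(s)
            queue.append(s)
    return ids

ids = accessible_ids(DELTA, "q1", "001100")
print(("q2", "", "") in ids)   # True: 001100 is accepted by empty stack
```

Running this on 001100 reproduces the fifteen ID's of Fig. 5.4, including the accepting ID (q2, ε, ε).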
5.3 PUSHDOWN AUTOMATA AND CONTEXT-FREE LANGUAGES

We shall now prove the fundamental result that the class of languages accepted by PDA's is precisely the class of context-free languages. We first show that the languages accepted by PDA's by final state are exactly the languages accepted by PDA's by empty stack. We then show that the languages accepted by empty stack are exactly the context-free languages.
Equivalence of acceptance by final state and empty stack

Theorem 5.1 If L is L(M2) for some PDA M2, then L is N(M1) for some PDA M1.

Proof In brief, we would like M1 to simulate M2, with the option for M1 to erase its stack whenever M2 enters a final state. We use state qe of M1 to erase the stack, and we use a bottom-of-stack marker X0 so that M1 does not accidentally accept by emptying its stack without M2 having entered a final state. Let M2 = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA such that L = L(M2). Let

M1 = (Q ∪ {qe, q0'}, Σ, Γ ∪ {X0}, δ', q0', X0, ∅),

where δ' is defined as follows.

1) δ'(q0', ε, X0) = {(q0, Z0X0)}.
2) δ'(q, a, Z) includes the elements of δ(q, a, Z) for all q in Q, a in Σ or a = ε, and Z in Γ.
3) For all q in F and Z in Γ ∪ {X0}, δ'(q, ε, Z) contains (qe, ε).
4) For all Z in Γ ∪ {X0}, δ'(qe, ε, Z) contains (qe, ε).

Rule (1) causes M1 to enter the initial ID of M2, except that M1 will have its own bottom-of-stack marker X0, which is below the symbols of M2's stack. Rule (2) allows M1 to simulate M2. Should M2 ever enter a final state, rules (3) and (4) allow M1 the choice of entering state qe and erasing its stack, thereby accepting the input, or continuing to simulate M2. One should note that M2 may possibly erase its entire stack for some input x not in L(M2). This is the reason M1 has its own special bottom-of-stack marker. Otherwise M1, in simulating M2, would also erase its entire stack, thereby accepting x when it should not.

Let x be in L(M2). Then (q0, x, Z0) ⊢*_M2 (q, ε, γ) for some q in F. Now consider M1 with input x. By rule (1),

(q0', x, X0) ⊢_M1 (q0, x, Z0X0).

By rule (2), every move of M2 is a legal move for M1; thus

(q0, x, Z0) ⊢*_M1 (q, ε, γ).

If a PDA can make a sequence of moves from a given ID, it can make the same sequence of moves from any ID obtained from the first by inserting a fixed string of stack symbols below the original stack contents. Thus

(q0', x, X0) ⊢_M1 (q0, x, Z0X0) ⊢*_M1 (q, ε, γX0).

By rules (3) and (4),

(q, ε, γX0) ⊢*_M1 (qe, ε, ε).

Therefore, (q0', x, X0) ⊢*_M1 (qe, ε, ε), and M1 accepts x by empty stack.

Conversely, if M1 accepts x by empty stack, it is easy to show that the sequence of moves must be one move by rule (1), then a sequence of moves by rule (2) in which M1 simulates acceptance of x by M2, followed by the erasure of M1's stack using rules (3) and (4). Thus x must be in L(M2).

Theorem 5.2 If L is N(M1) for some PDA M1, then L is L(M2) for some PDA M2.

Proof Our plan now is to have M2 simulate M1 and detect when M1 empties its stack; M2 enters a final state when and only when this occurs. Let M1 = (Q, Σ, Γ, δ, q0, Z0, ∅) be a PDA such that L = N(M1). Let

M2 = (Q ∪ {q0', qf}, Σ, Γ ∪ {X0}, δ', q0', X0, {qf}),

where δ' is defined as follows:

1) δ'(q0', ε, X0) = {(q0, Z0X0)}.
2) For all q in Q, a in Σ ∪ {ε}, and Z in Γ, δ'(q, a, Z) = δ(q, a, Z).
3) For all q in Q, δ'(q, ε, X0) contains (qf, ε).

Rule (1) causes M2 to enter the initial ID of M1, except that M2 will have its own bottom-of-stack marker X0, which is below the symbols of M1's stack. Rule (2) allows M2 to simulate M1. Should M1 ever erase its entire stack, then M2, when simulating M1, will erase its entire stack except the symbol X0 at the bottom. Rule (3) causes M2, when the X0 appears, to enter a final state, thereby accepting the input x. The proof that L(M2) = N(M1) is similar to the proof of Theorem 5.1 and is left as an exercise.
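The construction of Theorem 5.1 is a purely mechanical rewriting of the transition table. A Python sketch (our own dict encoding of δ, not the text's; the names q0', qe, and X0 follow the proof, and a pushed string is a tuple of symbols):

```python
def final_state_to_empty_stack(delta, stack_syms, q0, Z0, finals):
    """Theorem 5.1: given M2 = (Q, Sigma, Gamma, delta, q0, Z0, F)
    accepting by final state, build delta' for an M1 with N(M1) = L(M2).
    delta maps (state, input symbol or '' for an ε-move, top symbol)
    to a set of (next state, tuple of pushed symbols)."""
    d = {}
    def add(key, move):
        d.setdefault(key, set()).add(move)
    # rule (1): enter M2's initial ID on top of the new bottom marker X0
    add(("q0'", "", "X0"), (q0, (Z0, "X0")))
    # rule (2): every move of M2 is a move of M1
    for key, moves in delta.items():
        d.setdefault(key, set()).update(moves)
    for Z in stack_syms | {"X0"}:
        # rule (3): from a final state, M1 may start erasing its stack
        for q in finals:
            add((q, "", Z), ("qe", ()))
        # rule (4): qe erases everything, including X0
        add(("qe", "", Z), ("qe", ()))
    return d, "q0'", "X0"

# a toy M2 (our example): accepts {0^n 1^n | n >= 1} by final state qf
delta2 = {
    ("q0", "0", "Z"): {("q0", ("A", "Z"))},
    ("q0", "0", "A"): {("q0", ("A", "A"))},
    ("q0", "1", "A"): {("q1", ())},
    ("q1", "1", "A"): {("q1", ())},
    ("q1", "", "Z"): {("qf", ("Z",))},
}
d, start, bottom = final_state_to_empty_stack(delta2, {"Z", "A"}, "q0", "Z", {"qf"})
print(("qe", ()) in d[("qf", "", "X0")])   # True
```

The converse construction of Theorem 5.2 is the same kind of rewriting: copy δ, add the rule-(1) start move, and give every state of M1 the ε-move δ'(q, ε, X0) ∋ (qf, ε) into a new final state.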
Equivalence of PDA's and CFL's

Theorem 5.3 If L is a context-free language, then there exists a PDA M such that L = N(M).

Proof We assume that ε is not in L; the reader may modify the construction for the case where ε is in L. Let G = (V, T, P, S) be a context-free grammar in Greibach normal form generating L. Let

M = ({q}, T, V, δ, q, S, ∅),

where δ(q, a, A) contains (q, γ) whenever A → aγ is in P.

The PDA M simulates leftmost derivations of G. Since G is in Greibach normal form, each sentential form in a leftmost derivation consists of a string of terminals x followed by a string of variables α. M stores the suffix α of the left sentential form on its stack after processing the prefix x. Formally we show that

S ⇒* xα by a leftmost derivation if and only if (q, x, S) ⊢*_M (q, ε, α).    (5.1)

First we suppose that (q, x, S) ⊢^i (q, ε, α) and show by induction on i that S ⇒* xα. The basis, i = 0, is trivial, since x = ε and α = S. For the induction, suppose i ≥ 1, and let x = ya. Consider the next-to-last step,

(q, ya, S) ⊢^{i-1} (q, a, β) ⊢ (q, ε, α).    (5.2)

If we remove a from the end of the input string in the first i - 1 ID's of the sequence (5.2), we discover that (q, y, S) ⊢^{i-1} (q, ε, β), since a can have no effect on the moves of M until it is actually consumed from the input. By the inductive hypothesis, S ⇒* yβ. The move (q, a, β) ⊢ (q, ε, α) implies that β = Aγ for some A in V, that A → aη is a production of G, and that α = ηγ. Hence

S ⇒* yβ ⇒ yaηγ = xα,

and we conclude the "if" portion of (5.1).

Now suppose that S ⇒^i xα by a leftmost derivation. We show by induction on i that (q, x, S) ⊢* (q, ε, α). The basis, i = 0, is again trivial. Let i ≥ 1 and suppose

S ⇒^{i-1} yAγ ⇒ yaηγ,

where x = ya and α = ηγ. By the inductive hypothesis, (q, y, S) ⊢* (q, ε, Aγ), and thus (q, ya, S) ⊢* (q, a, Aγ). Since A → aη is a production, it follows that δ(q, a, A) contains (q, η). Thus

(q, x, S) ⊢* (q, a, Aγ) ⊢ (q, ε, α),

and the "only if" portion of (5.1) follows.

To conclude the proof, we have only to note that (5.1) with α = ε says S ⇒* x if and only if (q, x, S) ⊢*_M (q, ε, ε). That is, x is in L(G) if and only if x is in N(M).
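The Theorem 5.3 construction, and the correctness claim (5.1), can be checked experimentally. A Python sketch (our own encoding, not the text's: a production A → aγ is a triple (A, a, γ) with γ a string of variables, and the hypothetical GNF grammar for {0^n 1^n | n ≥ 1} below is our example):

```python
def gnf_to_pda(productions):
    """Theorem 5.3: from a Greibach-normal-form grammar, build the
    one-state PDA with N(M) = L(G): delta(q, a, A) contains (q, gamma)
    whenever A -> a gamma is a production."""
    delta = {}
    for head, terminal, tail in productions:
        delta.setdefault(("q", terminal, head), set()).add(("q", tail))
    return delta

def accepts(delta, w, start="S"):
    """Empty-stack acceptance by exhaustive search (leftmost = top)."""
    def search(rest, stack):
        if not stack:
            return not rest
        if not rest:
            return False    # GNF moves always consume an input symbol
        return any(search(rest[1:], gamma + stack[1:])
                   for _, gamma in delta.get(("q", rest[0], stack[0]), ()))
    return search(w, start)

# S -> 0SB | 0B,  B -> 1   (a GNF grammar for {0^n 1^n | n >= 1})
G = [("S", "0", "SB"), ("S", "0", "B"), ("B", "1", "")]
delta = gnf_to_pda(G)
print(accepts(delta, "0011"), accepts(delta, "0010"))  # True False
```

Because the grammar is in Greibach normal form, every move consumes an input symbol, so the search tree is finite and the simulation always terminates.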
Theorem 5.4 If L is N(M) for some PDA M, then L is a context-free language.

Proof Let M be the PDA (Q, Σ, Γ, δ, q0, Z0, ∅). Let G = (V, Σ, P, S) be a context-free grammar where V is the set of objects of the form [q, A, p], with q and p in Q and A in Γ, plus the new symbol S, and P is the set of productions:

1) S → [q0, Z0, q] for each q in Q;
2) [q, A, q_{m+1}] → a[q1, B1, q2][q2, B2, q3] ··· [qm, Bm, q_{m+1}] for each q, q1, q2, ..., q_{m+1} in Q, each a in Σ ∪ {ε}, and A, B1, B2, ..., Bm in Γ, such that δ(q, a, A) contains (q1, B1B2···Bm). (If m = 0, then the production is [q, A, q1] → a.)

To understand the proof it helps to know that the variables and productions of G have been defined in such a way that a leftmost derivation in G of a sentence x is a simulation of the PDA M when fed the input x. In particular, the variables that appear in any step of a leftmost derivation in G correspond to the symbols on the stack of M at a time when M has seen as much of the input as the grammar has already generated. Put another way, the intention is that [q, A, p] derive x if and only if x causes M to erase an A from its stack by some sequence of moves beginning in state q and ending in state p.

To show that L(G) = N(M), we prove by induction on the number of steps in a derivation of G or number of moves of M that

[q, A, p] ⇒*_G x  if and only if  (q, x, A) ⊢*_M (p, ε, ε).    (5.3)

First we show by induction on i that if (q, x, A) ⊢^i (p, ε, ε), then [q, A, p] ⇒* x. If i = 1, then δ(q, x, A) must contain (p, ε). (Here x is ε or a single input symbol.) Thus [q, A, p] → x is a production of G.

Now suppose i > 1. Let x = ay and

(q, ay, A) ⊢ (q1, y, B1B2···Bn) ⊢^{i-1} (p, ε, ε).

The string y can be written y = y1y2···yn, where yj has the effect of popping Bj from the stack, possibly after a long sequence of moves. That is, let y1 be the prefix of y at the end of which the stack first becomes as short as n - 1 symbols. Let y2 be the symbols of y following y1 such that at the end of y2 the stack first becomes as short as n - 2 symbols, and so on. The arrangement is shown in Fig. 5.5. Note that B1 need not be the nth stack symbol from the bottom during the entire time y1 is being read by M, since B1 may be changed if it is at the top of stack and is replaced by one or more symbols. However, none of B2, B3, ..., Bn are ever at the top while y1 is being read and so cannot be changed or influence the computation. In general, Bj remains on the stack unchanged while y1y2···y_{j-1} is read.

There exist states q2, q3, ..., q_{n+1}, with q_{n+1} = p, such that

(qj, yj, Bj) ⊢* (q_{j+1}, ε, ε)

by fewer than i moves (qj is the state entered when the stack first becomes as short as n - j + 1 symbols). Thus the inductive hypothesis applies, and

[qj, Bj, q_{j+1}] ⇒* yj  for 1 ≤ j ≤ n.

Fig. 5.5 Height of stack as a function of input consumed. [Figure: the stack height falls from n to 0 as y1, y2, ..., yn are consumed; state qj is entered when the stack first reaches height n - j + 1.]

Recalling the original move (q, ay, A) ⊢ (q1, y, B1B2···Bn), we know that

[q, A, p] → a[q1, B1, q2][q2, B2, q3] ··· [qn, Bn, q_{n+1}]

is a production of G, so

[q, A, p] ⇒ a[q1, B1, q2][q2, B2, q3] ··· [qn, Bn, q_{n+1}] ⇒* ay1y2···yn = x.

Now suppose [q, A, p] ⇒^i x. We show by induction on i that (q, x, A) ⊢* (p, ε, ε). The basis, i = 1, is immediate, since [q, A, p] → x must be a production of G and therefore δ(q, x, A) must contain (p, ε). Note x is ε or in Σ here.

For the induction, suppose

[q, A, p] ⇒ a[q1, B1, q2][q2, B2, q3] ··· [qn, Bn, q_{n+1}] ⇒^{i-1} x,

where q_{n+1} = p. Then we may write x = ax1x2···xn, where [qj, Bj, q_{j+1}] ⇒* xj for 1 ≤ j ≤ n, with each derivation taking fewer than i steps. By the inductive hypothesis, (qj, xj, Bj) ⊢* (q_{j+1}, ε, ε) for j = 1, 2, ..., n. If we insert B_{j+1}···Bn at the bottom of each stack in the above sequence of ID's, we see that

(qj, xj, BjB_{j+1}···Bn) ⊢* (q_{j+1}, ε, B_{j+1}···Bn).    (5.4)

From the first step in the derivation of x from [q, A, p] we know that

(q, x, A) ⊢ (q1, x1x2···xn, B1B2···Bn)

is a legal move of M, so from this move and (5.4) for j = 1, 2, ..., n, it follows that (q, x, A) ⊢* (p, ε, ε).

The proof concludes with the observation that (5.3) with q = q0 and A = Z0 says

[q0, Z0, p] ⇒* x  if and only if  (q0, x, Z0) ⊢* (p, ε, ε).

This observation, together with rule (1) of the construction of G, says that

S ⇒* x  if and only if  (q0, x, Z0) ⊢* (p, ε, ε) for some state p.

That is, x is in L(G) if and only if x is in N(M).
Example 5.3 Let

M = ({q0, q1}, {0, 1}, {X, Z0}, δ, q0, Z0, ∅),

where δ is given by

δ(q0, 0, Z0) = {(q0, XZ0)},     δ(q1, 1, X) = {(q1, ε)},
δ(q0, 0, X) = {(q0, XX)},      δ(q1, ε, X) = {(q1, ε)},
δ(q0, 1, X) = {(q1, ε)},       δ(q1, ε, Z0) = {(q1, ε)}.

To construct a CFG G = (V, T, P, S) generating N(M), let

V = {S, [q0, X, q0], [q0, X, q1], [q1, X, q0], [q1, X, q1],
     [q0, Z0, q0], [q0, Z0, q1], [q1, Z0, q0], [q1, Z0, q1]}

and T = {0, 1}.

To construct the set of productions easily, we must realize that some variables may not appear in any derivation starting from the symbol S. Thus, we can save some effort if we start with the productions for S, then add productions only for those variables that appear on the right of some production already in the set. The productions for S are

S → [q0, Z0, q0]
S → [q0, Z0, q1]

Next we add productions for the variable [q0, Z0, q0]. These are

[q0, Z0, q0] → 0[q0, X, q0][q0, Z0, q0]
[q0, Z0, q0] → 0[q0, X, q1][q1, Z0, q0]

These productions are required by δ(q0, 0, Z0) = {(q0, XZ0)}. Next, the productions for [q0, Z0, q1] are

[q0, Z0, q1] → 0[q0, X, q0][q0, Z0, q1]
[q0, Z0, q1] → 0[q0, X, q1][q1, Z0, q1]

These are also required by δ(q0, 0, Z0) = {(q0, XZ0)}. The productions for the remaining variables and the relevant moves of the PDA are:

1) [q0, X, q0] → 0[q0, X, q0][q0, X, q0]
   [q0, X, q0] → 0[q0, X, q1][q1, X, q0]
   [q0, X, q1] → 0[q0, X, q0][q0, X, q1]
   [q0, X, q1] → 0[q0, X, q1][q1, X, q1]
   since δ(q0, 0, X) = {(q0, XX)}.
2) [q0, X, q1] → 1    since δ(q0, 1, X) = {(q1, ε)}.
3) [q1, Z0, q1] → ε   since δ(q1, ε, Z0) = {(q1, ε)}.
4) [q1, X, q1] → ε    since δ(q1, ε, X) = {(q1, ε)}.
5) [q1, X, q1] → 1    since δ(q1, 1, X) = {(q1, ε)}.

It should be noted that there are no productions for the variables [q1, X, q0] and [q1, Z0, q0]. As all productions for [q0, X, q0] and [q0, Z0, q0] have [q1, X, q0] or [q1, Z0, q0] on the right, no terminal string can be derived from [q0, X, q0] or [q0, Z0, q0] either. Deleting all productions involving one of these four variables on either the right or left, we end up with the following productions.

S → [q0, Z0, q1]
[q0, Z0, q1] → 0[q0, X, q1][q1, Z0, q1]
[q0, X, q1] → 0[q0, X, q1][q1, X, q1]
[q0, X, q1] → 1
[q1, Z0, q1] → ε
[q1, X, q1] → ε
[q1, X, q1] → 1

We summarize Theorems 5.1 through 5.4 as follows. The three statements below are equivalent:

1) L is a context-free language.
2) L = N(M1) for some PDA M1.
3) L = L(M2) for some PDA M2.
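The triple construction of Theorem 5.4 is mechanical enough to program. The sketch below (our own encoding of the PDA of Example 5.3, writing Z for Z0; useless variables are not deleted) enumerates every production, including those for variables that derive no terminal string:

```python
from itertools import product

# δ of the PDA M of Example 5.3; '' marks an ε-move, leftmost = top.
DELTA = {
    ("q0", "0", "Z"): {("q0", "XZ")},
    ("q0", "0", "X"): {("q0", "XX")},
    ("q0", "1", "X"): {("q1", "")},
    ("q1", "1", "X"): {("q1", "")},
    ("q1", "", "X"): {("q1", "")},
    ("q1", "", "Z"): {("q1", "")},
}
STATES = ["q0", "q1"]

def pda_to_cfg(delta, states, start_state, start_sym):
    """Rules (1) and (2) of Theorem 5.4.  A production is (head, body):
    head is 'S' or a triple (q, A, p); the body mixes an optional
    terminal with (state, stack symbol, state) triples."""
    prods = [("S", [(start_state, start_sym, q)]) for q in states]
    for (q, a, A), moves in delta.items():
        for q1, pushed in moves:
            # guess the intermediate states q2, ..., q_{m+1}
            for mids in product(states, repeat=len(pushed)):
                chain = (q1,) + mids
                body = ([a] if a else []) + \
                       [(chain[j], pushed[j], chain[j + 1])
                        for j in range(len(pushed))]
                prods.append(((q, A, chain[len(pushed)]), body))
    return prods

prods = pda_to_cfg(DELTA, STATES, "q0", "Z")
print(len(prods))   # 14 productions before useless variables are removed
```

The fourteen productions produced here are exactly those listed in Example 5.3 before the four useless variables are deleted.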
EXERCISES

5.1 Construct pushdown automata for each of the languages in Exercise 4.1.

5.2 Construct a PDA equivalent to the following grammar.

S → aAA,  A → aS | bS | a.

5.3 Complete the proof of Theorem 5.3 by showing that every CFL L is the set accepted by some PDA even if ε is in L. [Hint: Add a second state to the PDA for L - {ε}.]

5.4 Show that if L is a CFL, then there is a PDA M accepting L by final state such that M has at most two states and makes no ε-moves.

* 5.5
a) Show that if L is a CFL, then L is L(M) for some PDA M such that if δ(q, a, X) contains (p, γ), then |γ| ≤ 2.
b) Show that the M of part (a) can be further restricted so that if δ(q, a, X) contains (p, γ), then γ is either ε (a pop move), X (no change to the stack), or YX for some stack symbol Y (a push move).
c) Can we put a bound on the number of states of the M in part (a) and still have a PDA for any CFL?
d) Can we put a bound on the number of states in part (b)?

5.6 Give a grammar for the language N(M) where

M = ({q0, q1}, {0, 1}, {Z0, X}, δ, q0, Z0, ∅)

and δ is given by

δ(q0, 1, Z0) = {(q0, XZ0)},     δ(q0, ε, Z0) = {(q0, ε)},
δ(q0, 1, X) = {(q0, XX)},      δ(q1, 1, X) = {(q1, ε)},
δ(q0, 0, X) = {(q1, X)},       δ(q1, 0, Z0) = {(q0, Z0)}.

5.7 The deterministic PDA (DPDA) is not equivalent to the nondeterministic PDA. For example, the language

L = {0^n 1^n | n ≥ 1} ∪ {0^n 1^{2n} | n ≥ 1}

is a CFL that is not accepted by any DPDA.
a) Show that L is a CFL.
** b) Prove that L is not accepted by a DPDA.

5.8 A language L is said to have the prefix property if no word in L is a proper prefix of another word in L. Show that if L is N(M) for a DPDA M, then L has the prefix property. Is the foregoing necessarily true if L is N(M) for a nondeterministic PDA M?

* 5.9 Show that L is N(M) for some DPDA M if and only if L has the prefix property and L is L(M') for some DPDA M'.

5.10 A two-way PDA (2PDA) is a PDA that is permitted to move either way on its input. Like the two-way FA, it accepts by moving off the right end of its input in a final state. Show that L = {0^n 1^n 2^n | n ≥ 1} is accepted by a 2PDA. We shall show in the next chapter that L is not a CFL, so 2PDA's are not equivalent to PDA's.

*S 5.11 Write a program to translate a regular expression to a finite automaton.

* 5.12 The grammar

E → E + E | E * E | (E) | id    (5.5)

generates the set of arithmetic expressions with +, *, parentheses, and id in infix notation (operator between the operands). The grammar

P → +PP | *PP | id

generates the set of arithmetic expressions in prefix notation (operator precedes the operands). Construct a program to translate arithmetic expressions from infix to prefix notation using the following technique. Design a deterministic PDA that parses an infix expression according to the grammar in (5.5). For each vertex in the parse tree determine the necessary action to produce the desired prefix expression. [Hint: See the solution to Exercise 5.11.]

5.13 Construct a compiler for infix arithmetic expressions that produces an assembly language program to evaluate the expression. Assume the assembly language has the single address instructions: LOAD x (copy x to accumulator), ADD x (add x to accumulator), MULT x (multiply contents of the accumulator by x), and STO x (store the contents of the accumulator in x).
Solutions to Selected Exercises

5.11 Writing a program to translate a regular expression to a finite automaton can be thought of as constructing a rudimentary compiler. We have already seen (Theorem 2.3) that finite automata accepting 0, 1, €, and ∅ can be combined to obtain an automaton equivalent to a given regular expression. The only problem is parsing the regular expression to determine the order in which to combine the automata. Our first step is to construct a CFG for the set of regular expressions and to write a parser for it; the final step is to add the automaton-generating routines. A grammar for regular expressions that groups subexpressions according to the conventional precedence of operations is given below. Note that € is used for the symbol ε.

E → P + E | P
P → T·P | T
T → 0 | 1 | € | ∅ | T* | (E)

The parsing routine is constructed directly from the grammar by writing a procedure for each variable. A global variable STRING initially contains the regular expression.

procedure FIND_EXPRESSION;
begin
    FIND_PRODUCT;
    while first symbol of STRING is + do
    begin
        delete first symbol of STRING;
        FIND_PRODUCT
    end
end FIND_EXPRESSION;

procedure FIND_PRODUCT;
begin
    FIND_TERM;
    while first symbol of STRING is · do
    begin
        delete first symbol of STRING;
        FIND_TERM
    end
end FIND_PRODUCT;

procedure FIND_TERM;
begin
    if first symbol of STRING is 0, 1, €, or ∅ then
        delete first symbol of STRING
    else if first symbol of STRING is ( then
    begin
        delete first symbol of STRING;
        FIND_EXPRESSION;
        if first symbol of STRING is ) then
            delete first symbol of STRING
        else error
    end;
    while first symbol of STRING is * do
        delete first symbol of STRING
end FIND_TERM

The actual parsing program consists of a single procedure call:

FIND_EXPRESSION;

Note that the recursive procedures FIND_EXPRESSION, FIND_PRODUCT, and FIND_TERM have no local variables. Thus they may be implemented by a stack that pushes E, P, or T, respectively, when a procedure is called, and pops the symbol when the procedure returns. (Although FIND_EXPRESSION has two calls to FIND_PRODUCT, both calls return to the same point in FIND_EXPRESSION, so the return location need not be stored. Similar comments apply to FIND_PRODUCT.) Thus, a deterministic PDA suffices to execute the program we have defined.

Having developed a procedure to parse a regular expression, we now add statements to output a finite automaton. Each procedure is modified to return a finite automaton. In procedure FIND_TERM, if the input symbol is 0, 1, €, or ∅, a finite automaton accepting that symbol's language is created and FIND_TERM returns this automaton. If the input symbol is (, then the finite automaton returned by FIND_EXPRESSION is the value of FIND_TERM. In either case, if the while loop for * is executed, the automaton is modified to accept the closure.

In procedure FIND_PRODUCT, the value of FIND_PRODUCT is assigned the value of the first call of FIND_TERM. Each time the "while" statement is executed, the value of FIND_PRODUCT is set to an automaton accepting the concatenation of the sets accepted by the current value of FIND_PRODUCT and the automaton returned by the call to FIND_TERM in the "while" loop. Similar statements are added to the procedure FIND_EXPRESSION.
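The three procedures translate almost verbatim into any language with recursion. A Python sketch (our own names and simplifications, not the solution's: it returns a parse tree rather than an automaton, uses 'e' for €, '.' for the concatenation operator, and omits ∅ for brevity):

```python
def parse_regex(s):
    """Recursive-descent parser for the precedence grammar of the solution:
    E -> P + E | P,  P -> T . P | T,  T -> 0 | 1 | e | (E), then *'s."""
    toks = list(s)

    def peek():
        return toks[0] if toks else None

    def find_expression():          # E -> P + E | P
        node = find_product()
        while peek() == "+":
            toks.pop(0)
            node = ("union", node, find_product())
        return node

    def find_product():             # P -> T . P | T
        node = find_term()
        while peek() == ".":
            toks.pop(0)
            node = ("concat", node, find_term())
        return node

    def find_term():                # T -> 0 | 1 | e | (E), then trailing *'s
        if peek() in ("0", "1", "e"):
            node = ("sym", toks.pop(0))
        elif peek() == "(":
            toks.pop(0)
            node = find_expression()
            if not toks or toks.pop(0) != ")":
                raise SyntaxError("missing )")
        else:
            raise SyntaxError("unexpected symbol")
        while peek() == "*":
            toks.pop(0)
            node = ("star", node)
        return node

    tree = find_expression()
    if toks:
        raise SyntaxError("trailing input")
    return tree

print(parse_regex("(0+1)*.0"))
```

To complete Exercise 5.11 along the lines the solution describes, each `("sym", ...)` leaf would instead build a small finite automaton, and `union`, `concat`, and `star` would combine automata as in Theorem 2.3.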
BIBLIOGRAPHIC NOTES The pushdown automaton appears as a formal construction in Oettinger [1961] and SchutIts equivalence to context-free grammars was perceived by Chomsky [1962] and Evey [1963]. A variety of similar devices have been studied. Counter machines have only one pushdown symbol, with the exception of a bottom-of-stack marker. They are discussed in zenberger [1963].
124
PUSHDOWN AUTOMATA
Fischer [1966], and Fischer, Meyer, and Rosenberg [1968]; see also the bibliographic notes to Chapter 7. Pushdown transducers are PDA's that may print symbols at each move. They have been studied by Evey [1963], Fischer [1963], Ginsburg and Rose [1966], Ginsburg and Greibach [1966b], and Lewis and Stearns [1968]. The two-way PDA mentioned in Exercise 5.10 has been studied by Hartmanis, Lewis,
and Stearns [1965]. Its closure properties were considered by Gray, Harrison, and Ibarra [1967], and characterizations of the class of languages accepted by the deterministic (2DPDA) and nondeterministic (2NPDA) varieties have been given by Aho, Hopcroft, and Ullman [1968], and Cook [1971c]. The latter contains the remarkable result that any language accepted by a 2DPDA is recognizable in linear time on a computer. Thus, the existence of a CFL requiring more than linear time to recognize on a computer, would imply that there are CFL's not accepted by 2DPDA's. However, no one to date has proved n n that such a language exists. Incidentally, the language {(yi 2 \n > 1} is an example of a non-CFL accepted by a 2DPDA.
CHAPTER 6

PROPERTIES OF CONTEXT-FREE LANGUAGES
To a large extent this chapter parallels Chapter 3. We shall first give a pumping lemma for context-free languages and use it to show that certain languages are not context free. We then consider closure properties of CFL's and finally we give algorithms to answer certain questions about CFL's.
6.1 THE PUMPING LEMMA FOR CFL's

The pumping lemma for regular sets states that every sufficiently long string in a regular set contains a short substring that can be pumped. That is, inserting as many copies of the substring as we like always yields a string in the regular set. The pumping lemma for CFL's states that there are always two short substrings close together that can be repeated, both the same number of times, as often as we like. The formal statement of the pumping lemma is as follows.

Lemma 6.1 (The pumping lemma for context-free languages). Let L be any CFL. Then there is a constant n, depending only on L, such that if z is in L and |z| ≥ n, then we may write z = uvwxy such that
1) |vx| ≥ 1,
2) |vwx| ≤ n, and
3) for all i ≥ 0, uv^i wx^i y is in L.
Proof Let G be a Chomsky normal-form grammar generating L − {ε}. Observe that if z is in L(G) and z is long, then any parse tree for z must contain a long path. More precisely, we show by induction on i that if the parse tree of a word generated by a Chomsky normal-form grammar has no path of length greater than i, then the word is of length no greater than 2^(i−1). The basis, i = 1, is trivial, since the tree must be of the form shown in Fig. 6.1(a). For the induction step, let i > 1. Let the root and its sons be as shown in Fig. 6.1(b). If there are no paths of length greater than i − 1 in trees T1 and T2, then the trees generate words of 2^(i−2) or fewer symbols. Thus the entire tree generates a word no longer than 2^(i−1).

[Fig. 6.1 Parse trees.]

Let G have k variables and let n = 2^k. If z is in L(G) and |z| ≥ n, then since |z| > 2^(k−1), any parse tree for z must have a path of length at least k + 1. But such a path has at least k + 2 vertices, all but the last of which are labeled by variables. Thus there must be some variable that appears twice on the path.

We can in fact say more. Some variable must appear twice near the bottom of the path. In particular, let P be a path that is as long as or longer than any path in the tree. Then there must be two vertices v1 and v2 on the path satisfying the following conditions.
1) The vertices v1 and v2 both have the same label, say A.
2) Vertex v1 is closer to the root than vertex v2.
3) The portion of the path from v1 to the leaf is of length at most k + 1.

To see that v1 and v2 can always be found, just proceed up path P from the leaf, keeping track of the labels encountered. Of the first k + 2 vertices, only the leaf has a terminal label. The remaining k + 1 cannot all have distinct variable labels.

Now the subtree T1 with root v1 represents the derivation of a subword of length at most 2^k. This is true because there can be no path of length greater than k + 1 in T1, since P was a path of longest length in the entire tree. Let z1 be the yield of the subtree T1. If T2 is the subtree generated by vertex v2, and z2 is the yield of T2, then we can write z1 as z3 z2 z4. Furthermore, z3 and z4 cannot both be ε, since the first production used in the derivation of z1 must be of the form A → BC for some variables B and C. The subtree T2 must be completely within either the subtree generated by B or the subtree generated by C. The above is illustrated in Fig. 6.2.
[Fig. 6.2 Illustration of subtrees T1 and T2 of Lemma 6.1, for the grammar G = ({A, B, C}, {a, b}, {A → BC, B → BA, C → BA, A → a, B → b}, A) and z = bbbaba: (a) the tree; (b) subtree T1, with yield z1 = bba; (c) subtree T2, with z3 = bb.]

We now know that A ⇒* z3 A z4 and A ⇒* z2, where |z3 z2 z4| ≤ 2^k = n. But then it clearly follows that A ⇒* z3^i z2 z4^i for all i ≥ 0. (See Fig. 6.3.) The string z can be written as u z3 z2 z4 y for some u and y. We let v = z3, w = z2, and x = z4 to complete the proof.

[Fig. 6.3 The derivation of uv^i wx^i y, where u = b, v = bb, w = a, x = ε, and y = ba.]

Applications of the pumping lemma
The pumping lemma can be used to prove a variety of languages not to be context free, using the same "adversary" argument as for the regular set pumping lemma.
Example 6.1 Consider the language L1 = {a^i b^i c^i | i ≥ 1}. Suppose L1 were context free and let n be the constant of Lemma 6.1. Consider z = a^n b^n c^n. Write z = uvwxy so as to satisfy the conditions of the pumping lemma. We must ask ourselves where v and x, the strings that get pumped, could lie in a^n b^n c^n. Since |vwx| ≤ n, it is not possible for vx to contain instances of both a's and c's, because the rightmost a is n + 1 positions away from the leftmost c. If v and x consist of a's only, then uwy (the string uv^i wx^i y with i = 0) has n b's and n c's but fewer than n a's, since |vx| ≥ 1. Thus, uwy is not of the form a^j b^j c^j. But by the pumping lemma uwy is in L1, a contradiction. The cases where v and x consist only of b's or only of c's are disposed of similarly. If vx has a's and b's, then uwy has more c's than a's or b's, and again it is not in L1. If vx contains b's and c's, a similar contradiction results. We conclude that L1 is not a context-free language.
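The adversary argument of Example 6.1 can be checked exhaustively for a small n. The sketch below is our own illustration, not the book's: it tries every decomposition z = uvwxy with |vwx| ≤ n and |vx| ≥ 1, and verifies that each one falls out of L1 for some pumping exponent i (a few small exponents suffice as probes).

```python
from itertools import combinations_with_replacement

def in_L(s):
    """Membership in L1 = { a^i b^i c^i | i >= 1 }."""
    i = len(s) // 3
    return i >= 1 and s == 'a' * i + 'b' * i + 'c' * i

def pumpable(z, n):
    """True if some decomposition z = uvwxy with |vwx| <= n and |vx| >= 1
    keeps uv^i w x^i y in L1 for every probed exponent i."""
    # four cut points a <= b <= c <= d split z into u, v, w, x, y
    for a, b, c, d in combinations_with_replacement(range(len(z) + 1), 4):
        u, v, w, x, y = z[:a], z[a:b], z[b:c], z[c:d], z[d:]
        if d - a <= n and len(v) + len(x) >= 1:
            if all(in_L(u + v * i + w + x * i + y) for i in (0, 2, 3)):
                return True
    return False
```

With n = 3 and z = a^3 b^3 c^3, every decomposition fails, exactly as the example argues.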
The pumping lemma can also be used to show that certain languages similar to L1 are not context free. Some examples are {a^i b^i c^j | j > i} and {a^i b^j | j = i^2}.

Another type of relationship that CFG's cannot enforce is illustrated in the next example.
Example 6.2 Let L2 = {a^i b^j c^i d^j | i ≥ 1 and j ≥ 1}. Suppose L2 is a CFL, and let n be the constant in Lemma 6.1. Consider the string z = a^n b^n c^n d^n, and let z = uvwxy satisfy the conditions of the pumping lemma. Since |vwx| ≤ n, vwx cannot contain both an a and a c, nor both a b and a d. Thus pumping v and x changes the number of a's without changing the number of c's, or the number of b's without changing the number of d's, so uwy is not in L2. But the pumping lemma says uwy is in L2, a contradiction. We conclude that L2 is not context free.
Ogden's lemma

There are certain non-CFL's for which the pumping lemma is of no help. For example,

L3 = {a^i b^j c^k d^l | i = 0 or j = k = l}

is not context free. However, if we choose z = b^j c^j d^j and write z = uvwxy, then it is always possible to choose u, v, w, x, and y so that uv^m wx^m y is in L3 for all m; for example, choose vwx to have only b's. If we choose z = a b^j c^j d^j, then v and x might consist only of a's, in which case uv^m wx^m y is again in L3 for all m. What we need is a stronger version of the pumping lemma that allows us to focus on some small number of positions in the string and pump them. Such an extension is easy for regular sets, as any sequence of n + 1 states of an n-state FA must contain some state twice, and the intervening string can be pumped. The result for CFL's is much harder to obtain but can be shown. Here we state and prove a weak version of what is known as Ogden's lemma.
Lemma 6.2 (Ogden's lemma). Let L be a CFL. Then there is a constant n (which may in fact be the same as for the pumping lemma) such that if z is any word in L, and we mark any n or more positions of z "distinguished," then we can write z = uvwxy, such that:
1) v and x together have at least one distinguished position,
2) vwx has at most n distinguished positions, and
3) for all i ≥ 0, uv^i wx^i y is in L.
Proof Let G be a Chomsky normal-form grammar generating L − {ε}. Let G have k variables and choose n = 2^(k+1). We must construct a path P in the tree analogous to path P in the proof of the pumping lemma. However, since we worry only about distinguished positions here, we cannot concern ourselves with every vertex along P, but only with branch points, which are vertices both of whose sons have distinguished descendants.

Construct P as follows. Begin by putting the root on path P. Suppose r is the last vertex placed on P. If r is a leaf, we end. If r has only one son with distinguished descendants, add that son to P and repeat the process there. If both sons of r have distinguished descendants, call r a branch point and add the son with the larger number of distinguished descendants to P (break a tie arbitrarily). This process is illustrated in Fig. 6.4.

It follows that each branch point on P has at least half as many distinguished descendants as the previous branch point. Since there are at least n distinguished positions in z, and all of these are descendants of the root, it follows that there are at least k + 1 branch points on P. Thus among the last k + 1 branch points are two with the same label. We may select v1 and v2 to be two of these branch points with the same label and with v1 closer to the root than v2. The proof then proceeds exactly as for the pumping lemma.
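The branch-point construction in the proof can be sketched directly. The tuple encoding of parse trees below is our own, not the book's: an interior node is (label, left, right), and a CNF leaf A → a is written (A, position), where position indexes the derived string.

```python
def distinguished(t, marked):
    """Count distinguished (marked) leaf positions below node t."""
    if isinstance(t[1], int):                  # a leaf (A, position)
        return 1 if t[1] in marked else 0
    return distinguished(t[1], marked) + distinguished(t[2], marked)

def path_with_branch_points(t, marked):
    """Follow the proof's rule: descend toward the son with more
    distinguished descendants, recording branch points on the way."""
    path, branch = [], []
    while True:
        path.append(t[0])
        if isinstance(t[1], int):              # reached a leaf: path P ends
            return path, branch
        dl = distinguished(t[1], marked)
        dr = distinguished(t[2], marked)
        if dl > 0 and dr > 0:                  # both sons marked: a branch point
            branch.append(t[0])
        t = t[1] if dl >= dr else t[2]         # larger count; ties go left
```

Each recorded branch point has at least half as many distinguished descendants as the previous one, which is what forces a repeated label among the last k + 1 of them.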
[Fig. 6.4 The path P. Distinguished positions are marked b; branch points are marked ×.]

Example 6.3 Let L4 = {a^i b^j c^k | i ≠ j, j ≠ k, and i ≠ k}. Suppose L4 were a context-free language. Let n be the constant in Ogden's lemma and consider the string z = a^n b^(n+n!) c^(n+2n!). Let the positions of the a's be distinguished and let z = uvwxy satisfy the conditions of Ogden's lemma. If either v or x contains two distinct symbols, then uv^2 wx^2 y is not in L4. (For example, if v is in a^+ b^+, then uv^2 wx^2 y has a b preceding an a.) Now at least one of v and x must contain a's, since only a's are in distinguished positions. Thus, if x is in b^+ or c^+, v must be in a^+. If x is in a^+, then v must be in a*, otherwise a b or c would precede an a. We consider in detail the situation where x is in b^+. The other cases are handled similarly. Suppose x is in b^+ and v in a^+. Let p = |v|. Then 1 ≤ p ≤ n, so p divides n!. Let q be the integer such that pq = n!. Then z' = uv^(2q+1) wx^(2q+1) y is in L4. But v^(2q+1) = a^(2pq+p) = a^(2n!+p). Since uwy contains exactly (n − p) a's, z' has (2n! + n) a's. Since v and x have no c's, z' also has (2n! + n) c's and hence is not in L4, a contradiction. A similar contradiction occurs if x is in a^+ or c^+. Thus L4 is not a context-free language.

Note that Lemma 6.1 is a special case of Ogden's lemma in which all positions are distinguished.
6.2 CLOSURE PROPERTIES OF CFL's

We now consider some operations that preserve context-free languages. The operations are useful not only in constructing CFL's or proving that certain languages are context free, but also in proving certain languages not to be context free. A given language L can be shown not to be context free by constructing from L a language that is not context free, using only operations preserving CFL's.
Theorem 6.1 Context-free languages are closed under union, concatenation, and Kleene closure.

Proof Let L1 and L2 be CFL's generated by the CFG's G1 = (V1, T1, P1, S1) and G2 = (V2, T2, P2, S2), respectively. Since we may rename variables at will without changing the language generated, we assume that V1 and V2 are disjoint. Assume also that S3, S4, and S5 are not in V1 or V2.

For L1 ∪ L2 construct grammar G3 = (V1 ∪ V2 ∪ {S3}, T1 ∪ T2, P3, S3), where P3 is P1 ∪ P2 plus the productions S3 → S1 | S2. If w is in L1, then the derivation S3 ⇒ S1 ⇒* w is a derivation in G3, as every production of G1 is a production of G3. Similarly, every word in L2 has a derivation in G3 beginning with S3 ⇒ S2. Thus L1 ∪ L2 ⊆ L(G3). For the converse, let w be in L(G3). Then the derivation S3 ⇒* w begins with either S3 ⇒ S1 ⇒* w or S3 ⇒ S2 ⇒* w. In the former case, as V1 and V2 are disjoint, only symbols of G1 may appear in the derivation S1 ⇒* w. As the only productions of P3 that involve only symbols of G1 are those from P1, we conclude that only productions of P1 are used in the derivation S1 ⇒* w. Thus S1 ⇒* w in G1, and w is in L1. Analogously, if the derivation starts S3 ⇒ S2, we may conclude w is in L2. Hence L(G3) ⊆ L1 ∪ L2, so L(G3) = L1 ∪ L2, as desired.

For concatenation, let G4 = (V1 ∪ V2 ∪ {S4}, T1 ∪ T2, P4, S4), where P4 is P1 ∪ P2 plus the production S4 → S1 S2. A proof that L(G4) = L(G1)L(G2) is similar to the proof for union and is omitted.

For closure, let G5 = (V1 ∪ {S5}, T1, P5, S5), where P5 is P1 plus the productions S5 → S1 S5 | ε. We again leave the proof that L(G5) = L(G1)* to the reader.
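The three constructions can be sketched as operations on grammars. The encoding below — a grammar as (variables, start, productions), with single-character variable names and plain-string bodies — is our own simplification, not the book's; variable sets are assumed disjoint, as in the proof.

```python
def union(g1, g2, start='S3'):
    (v1, s1, p1), (v2, s2, p2) = g1, g2
    # S3 -> S1 | S2
    return v1 | v2 | {start}, start, {**p1, **p2, start: [s1, s2]}

def concat(g1, g2, start='S4'):
    (v1, s1, p1), (v2, s2, p2) = g1, g2
    # S4 -> S1 S2
    return v1 | v2 | {start}, start, {**p1, **p2, start: [s1 + s2]}

def star(g1, start='S5'):
    v1, s1, p1 = g1
    # S5 -> S1 S5 | epsilon; the empty body '' stands for epsilon.
    # (Bodies are plain strings, so the two-character name 'S5' appearing
    # inside a body is purely illustrative.)
    return v1 | {start}, start, {**p1, start: [s1 + start, '']}

# For instance, with L1 = { a^i b^i } and L2 = { c^j }:
g1 = ({'A'}, 'A', {'A': ['aAb', 'ab']})
g2 = ({'B'}, 'B', {'B': ['cB', 'c']})
```

Each construction only adds a fresh start symbol and one or two productions, which is why union, concatenation, and closure preserve context-freeness so cheaply.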
Substitution and homomorphisms

Theorem 6.2 The context-free languages are closed under substitution.

Proof Let L be a CFL, L ⊆ Σ*, and for each a in Σ let La be a CFL. Let L be L(G) and for each a let La be L(G_a). Without loss of generality assume that the variables of G and the G_a's are disjoint. Construct a grammar G' as follows. The variables of G' are all the variables of G and the G_a's; the terminals of G' are the terminals of the G_a's. The start symbol of G' is the start symbol of G. The productions of G' are all the productions of the G_a's together with those productions formed by taking a production A → α of G and substituting S_a, the start symbol of G_a, for each instance of an a in Σ appearing in α.
Example 6.4 Let L be the set of words with an equal number of a's and b's, La = {0^n 1^n | n ≥ 1}, and Lb = {w w^R | w is in (0 + 2)*}. For G we may choose

S → aSbS | bSaS | ε

For G_a take

S_a → 0 S_a 1 | 01

For G_b take

S_b → 0 S_b 0 | 2 S_b 2 | ε

If f is the substitution f(a) = La and f(b) = Lb, then f(L) is generated by the grammar

S → S_a S S_b S | S_b S S_a S | ε
S_a → 0 S_a 1 | 01
S_b → 0 S_b 0 | 2 S_b 2 | ε

One should observe that since {a, b}, {ab}, and a* are CFL's, the closure of CFL's under substitution implies closure under union, concatenation, and *. The union of La and Lb is simply the substitution of La and Lb into {a, b}, and similarly La Lb and La* are the substitutions into {ab} and a*, respectively. Thus Theorem 6.1 could be presented as a corollary of Theorem 6.2.
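Theorem 6.2's construction can be sketched in the same hypothetical grammar encoding as before: every terminal a that has a substitution grammar G_a is replaced, in each body of G, by the start symbol of G_a, and the productions of the G_a's are carried over.

```python
def substitute(g, subs):
    """subs maps a terminal of G to its grammar (variables, start, productions)."""
    v, s, p = g
    new_vars, new_prods = set(v), {}
    for va, sa, pa in subs.values():        # carry over each G_a wholesale
        new_vars |= va
        new_prods.update(pa)
    for var, bodies in p.items():           # rewrite the bodies of G
        new_prods[var] = [''.join(subs[c][1] if c in subs else c for c in body)
                          for body in bodies]
    return new_vars, s, new_prods
```

Applied to the grammars of Example 6.4, the production S → aSbS becomes S → S_a S S_b S (here spelled ASBS with one-letter names A and B for S_a and S_b).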
Since a homomorphism is a special type of substitution, we state the following corollary.

Corollary The CFL's are closed under homomorphism.

Theorem 6.3 The context-free languages are closed under inverse homomorphism.

Proof As with regular sets, a machine-based proof for closure under inverse homomorphism is easiest to understand. Let h: Σ → Δ* be a homomorphism and L be a CFL. Let L = L(M), where M is the PDA (Q, Δ, Γ, δ, q0, Z0, F). In analogy with the finite-automaton construction of Theorem 3.5, we construct a PDA M' accepting h^(−1)(L) as follows. On input a, M' generates the string h(a) and simulates M on h(a). If M were a finite automaton, all it could do on a string h(a) would be to change state, so M' could simulate such a composite move in one of its moves. However, in the PDA case, M could pop many symbols on a string, or, since it is nondeterministic, make moves that push an arbitrary number of symbols on the stack. Thus M' cannot necessarily simulate M's moves on h(a) with one (or any finite number of) moves of its own.

What we do is give M' a buffer, in which it may store h(a). Then M' may simulate any ε-moves of M it likes and consume the symbols of h(a) one at a time, as if they were M's input. As the buffer is part of M''s finite control, it cannot be allowed to grow arbitrarily long. We ensure this by permitting M' to read an input symbol only when the buffer is empty. Thus the buffer holds a suffix of h(a) for some a at all times. M' accepts its input w if the buffer is empty and M is in a final state, that is, if M has accepted h(w). Thus L(M') = {w | h(w) is in L}, that is, h^(−1)(L(M)). The arrangement is depicted in Fig. 6.5; the formal construction follows.

[Fig. 6.5 Construction of a PDA M' accepting h^(−1)(L): the input to M', a buffer, the control of M' containing the control of M, and the stack of M.]
Let

M' = (Q', Σ, Γ, δ', [q0, ε], Z0, F × {ε}),

where Q' consists of pairs [q, x] such that q is in Q and x is a (not necessarily proper) suffix of some h(a) for a in Σ. δ' is defined as follows:

1) δ'([q, x], ε, Y) contains all ([p, x], γ) such that δ(q, ε, Y) contains (p, γ). Simulate ε-moves of M independent of the buffer contents.
2) δ'([q, ax], ε, Y) contains all ([p, x], γ) such that δ(q, a, Y) contains (p, γ). Simulate moves of M on input a in Δ, removing a from the front of the buffer.
3) δ'([q, ε], a, Y) contains ([q, h(a)], Y) for all a in Σ and Y in Γ. Load the buffer with h(a), reading a from M''s input; the state and stack of M remain unchanged.
To show that L(M') = h^(−1)(L(M)), first observe that by one application of rule (3), followed by applications of rules (1) and (2), if (q, h(a), α) ⊢* (p, ε, β) in M, then

([q, ε], a, α) ⊢ ([q, h(a)], ε, α) ⊢* ([p, ε], ε, β) in M'.

Thus if M accepts h(w), that is, (q0, h(w), Z0) ⊢* (p, ε, β) for some p in F and β in Γ*, it follows that ([q0, ε], w, Z0) ⊢* ([p, ε], ε, β), so M' accepts w. Thus L(M') ⊇ h^(−1)(L(M)).

Conversely, suppose M' accepts w = a1 a2 ··· an. Then since rule (3) can be applied only with the buffer (second component of M''s state) empty, the sequence of moves of M' leading to acceptance can be written

([q0, ε], a1 a2 ··· an, Z0) ⊢* ([p1, ε], a1 a2 ··· an, α1)
⊢ ([p1, h(a1)], a2 a3 ··· an, α1)
⊢* ([p2, ε], a2 a3 ··· an, α2)
⊢ ([p2, h(a2)], a3 a4 ··· an, α2)
⊢* ··· ⊢* ([p_{n+1}, ε], ε, α_{n+1}),

where p_{n+1} is in F. The transitions from state [p_i, ε] to [p_i, h(a_i)] are by rule (3), and the other transitions are by rules (1) and (2). Thus, (q0, ε, Z0) ⊢* (p1, ε, α1) in M, and for all i, (p_i, h(a_i), α_i) ⊢* (p_{i+1}, ε, α_{i+1}). Putting these moves together, we have

(q0, h(a1 a2 ··· an), Z0) ⊢* (p_{n+1}, ε, α_{n+1}),

so h(a1 a2 ··· an) is in L(M). Hence L(M') ⊆ h^(−1)(L(M)), whereupon we conclude L(M') = h^(−1)(L(M)).
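The buffer construction produces a PDA, but at the language level the content of Theorem 6.3 is simply that w is in h^(−1)(L) if and only if h(w) is in L. A sketch, assuming some decision procedure in_L for L (the names and encodings are our own):

```python
def h(w, images):
    """Apply the homomorphism given by images, e.g. {'a': '0', 'b': '1'}."""
    return ''.join(images[c] for c in w)

def in_inverse_image(w, images, in_L):
    """w is in h^-1(L) iff h(w) is in L."""
    return in_L(h(w, images))

# Illustration: L = { 0^n 1^n }, h(a) = 0, h(b) = 1, so h^-1(L) = { a^n b^n }.
images = {'a': '0', 'b': '1'}
in_L = lambda s: len(s) % 2 == 0 and s == '0' * (len(s) // 2) + '1' * (len(s) // 2)
```

The point of the PDA construction above is that this test can be carried out by a single pushdown machine, without a separate pass to compute h(w).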
Boolean operations

There are several closure properties of regular sets that are not possessed by the context-free languages. Notable among these are closure under intersection and complementation.

Theorem 6.4 The CFL's are not closed under intersection.

Proof In Example 6.1 we showed the language L1 = {a^i b^i c^i | i ≥ 1} was not a CFL. We claim that L2 = {a^i b^i c^j | i ≥ 1 and j ≥ 1} and L3 = {a^i b^j c^j | i ≥ 1 and j ≥ 1} are both CFL's. For example, a PDA to recognize L2 stores the a's on its stack and cancels them against b's, then accepts its input after seeing one or more c's. Alternatively, L2 is generated by the grammar

S → AB
A → aAb | ab
B → cB | c
where A generates a^i b^i and B generates c^j. A similar grammar

S → CD
C → aC | a
D → bDc | bc

generates L3. However, L2 ∩ L3 = L1. If the CFL's were closed under intersection, L1 would thus be a CFL, contradicting Example 6.1.

Corollary The CFL's are not closed under complementation.

Proof We know the CFL's are closed under union. If they were closed under complementation, they would, by DeMorgan's law (L1 ∩ L2 is the complement of the union of the complements of L1 and L2), be closed under intersection, contradicting Theorem 6.4.

Although the class of CFL's is not closed under intersection, it is closed under intersection with a regular set.

Theorem 6.5 If L is a CFL and R is a regular set, then L ∩ R is a CFL.

Proof Let L be L(M) for PDA M = (Q_M, Σ, Γ, δ_M, q0, Z0, F_M), and let R be L(A) for DFA A = (Q_A, Σ, δ_A, p0, F_A). We construct a PDA M' for L ∩ R by "running M and A in parallel," as shown in Fig. 6.6. M' simulates moves of M on input ε without changing the state of A. When M' makes a move on input symbol a, M' simulates a move of M on a and also simulates A's change of state on input a. M' accepts if and only if both A and M accept. Formally, let

M' = (Q_A × Q_M, Σ, Γ, δ, [p0, q0], Z0, F_A × F_M),

[Fig. 6.6 Running an FA and a PDA in parallel: the input feeds the control of M', which contains the controls of A and M and the stack of M.]
where δ is defined by: δ([p, q], a, X) contains ([p', q'], γ) if and only if δ_A(p, a) = p' and δ_M(q, a, X) contains (q', γ). Note that a may be ε or a symbol of Σ; if a = ε, then p' = p.

An easy induction on i shows that

([p0, q0], w, Z0) ⊢^i ([p, q], ε, γ) in M'

if and only if

(q0, w, Z0) ⊢^i (q, ε, γ) in M and δ_A(p0, w) = p.

The basis, i = 0, is trivial, since p = p0, q = q0, w = ε, and γ = Z0. For the induction, assume the statement for i − 1, and let

([p0, q0], xa, Z0) ⊢^(i−1) ([p', q'], a, β) ⊢ ([p, q], ε, γ),

where w = xa and a is ε or a symbol of Σ. By the inductive hypothesis, δ_A(p0, x) = p' and (q0, x, Z0) ⊢^(i−1) (q', ε, β) in M. By the definition of δ, the fact that ([p', q'], a, β) ⊢ ([p, q], ε, γ) tells us that δ_A(p', a) = p and (q', a, β) ⊢ (q, ε, γ) in M. Thus δ_A(p0, w) = p and (q0, w, Z0) ⊢^i (q, ε, γ) in M. The converse, showing that (q0, w, Z0) ⊢^i (q, ε, γ) in M and δ_A(p0, w) = p imply ([p0, q0], w, Z0) ⊢^i ([p, q], ε, γ) in M', is similar and left as an exercise.
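The product machine's transition function can be sketched as a table construction. The encodings below are our own assumptions: a DFA as a dict from (state, symbol) to state, and a PDA as a dict from (state, symbol, stack top) to a list of (state, pushed string), with '' standing for ε.

```python
def product_delta(dfa, pda):
    """Build the transition table of M' = A x M per Theorem 6.5."""
    dfa_states = {p for (p, _) in dfa} | set(dfa.values())
    delta = {}
    for (q, a, X), moves in pda.items():
        for p in dfa_states:
            # on an epsilon-move the DFA component stands still;
            # otherwise it follows its own transition on a
            p_next = p if a == '' else dfa.get((p, a))
            if p_next is None:
                continue
            delta.setdefault(((p, q), a, X), []).extend(
                ((p_next, q2), gamma) for (q2, gamma) in moves)
    return delta
```

The accepting states of the product would be all pairs (p, q) with p final in A and q final in M, mirroring F_A × F_M in the proof.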
Use of closure properties

We conclude this section with an example illustrating the use of closure properties of context-free languages to prove that certain languages are not context free.

Example 6.5 Let L = {ww | w is in (a + b)*}. That is, L consists of all words whose first and last halves are the same. Suppose L were context free. Then by Theorem 6.5, L1 = L ∩ a^+ b^+ a^+ b^+ would also be a CFL. But L1 = {a^i b^j a^i b^j | i ≥ 1, j ≥ 1}. L1 is almost the same as the language proved not to be context free in Example 6.2, using the pumping lemma. The same argument shows that L1 is not a CFL. We thus contradict the assumption that L is a CFL.

If we did not want to use the pumping lemma on L1, we could reduce it to L2 = {a^i b^j c^i d^j | i ≥ 1 and j ≥ 1}, the exact language discussed in Example 6.2. Let h be the homomorphism h(a) = h(c) = a and h(b) = h(d) = b. Then h^(−1)(L1) consists of all words of the form x1 x2 x3 x4, where x1 and x3 are of the same length and in (a + c)^+, and x2 and x4 are of equal length and in (b + d)^+. Then h^(−1)(L1) ∩ a*b*c*d* = L2. By Theorems 6.3 and 6.5, if L1 were a CFL, so would be L2. Since L2 is known not to be a CFL, we conclude that L1 is not a CFL.
6.3 DECISION ALGORITHMS FOR CFL's
There are a number of questions about CFL's we can answer. These include whether a given CFL is empty, finite, or infinite, and whether a given word is in a given CFL. There are, however, certain questions about CFL's that no algorithm can answer. These include whether two CFG's are equivalent, whether a CFL is cofinite, whether the complement of a given CFL is also a CFL, and whether a given CFG is ambiguous. In the next two chapters we shall develop tools for showing that no algorithm to do a particular job exists. In Chapter 8 we shall actually prove that the above questions and others have no algorithms. In this chapter we shall content ourselves with giving algorithms for some of the questions that have algorithms.
As with regular sets, we have several representations for CFL's, namely context-free grammars and pushdown automata accepting by empty stack or by final state. As the constructions of Chapter 5 are all effective, an algorithm that uses one representation can be made to work for any of the others. We shall use the CFG representation in this section.

Theorem 6.6 There are algorithms to determine if a CFL is (a) empty, (b) finite, or (c) infinite.
Proof The theorem can be proved by the same technique (Theorem 3.7) as the analogous result for regular sets, by making use of the pumping lemma. However, the resulting algorithms are highly inefficient. Actually, we have already given a better algorithm to test whether a CFL is empty. For a CFG G = (V, T, P, S), the test of Lemma 4.1 determines whether a variable generates any string of terminals. Clearly, L(G) is nonempty if and only if the start symbol S generates some string of terminals.
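The marking test of Lemma 4.1 can be sketched as a fixed-point iteration; the dict-of-productions encoding is our own.

```python
def nonempty(prods, start, variables):
    """L(G) is nonempty iff the start symbol generates a terminal string.
    Iteratively mark variables all of whose symbols in some body are
    terminals or already-marked variables."""
    generating, changed = set(), True
    while changed:
        changed = False
        for A, bodies in prods.items():
            if A not in generating and any(
                    all(c in generating or c not in variables for c in body)
                    for body in bodies):
                generating.add(A)
                changed = True
    return start in generating
```

Each pass marks at least one new variable or halts, so the loop runs at most |V| + 1 times.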
To test whether L(G) is finite, use the algorithm of Theorem 4.5 to find a CFG G' = (V', T', P', S) in CNF, with no useless symbols, generating L(G) − {ε}. L(G') is finite if and only if L(G) is finite. A simple test for finiteness of a CNF grammar with no useless symbols is to draw a directed graph with a vertex for each variable and an edge from A to B if there is a production of the form A → BC or A → CB for any C. Then the language generated is finite if and only if this graph has no cycles.

If there is a cycle, say A0, A1, ..., An, A0, then

A0 ⇒ α1 A1 β1 ⇒ α2 A2 β2 ⇒ ··· ⇒ αn An βn ⇒ α_{n+1} A0 β_{n+1},

where the α's and β's are strings of variables, with |α_i β_i| = i. Since there are no useless symbols, α_{n+1} ⇒* w and β_{n+1} ⇒* x for some terminal strings w and x of total length at least n + 1. Since n ≥ 0, w and x cannot both be ε. Next, as there are no useless symbols, we can find y and z such that S ⇒* y A0 z, and a terminal string v such that A0 ⇒* v. Then for all i,

S ⇒* y A0 z ⇒* y w A0 x z ⇒* y w^2 A0 x^2 z ⇒* ··· ⇒* y w^i A0 x^i z ⇒* y w^i v x^i z.
PROPERTIES OF CONTEXT-FREE LANGUAGES
138
As an
|
wx > |
yw vx z cannot equal l
0,
l
ytfvx'z
if
i
j.
=fc
Thus
the
grammar
generates
number of strings.
infinite
Conversely, suppose the graph has
no
cycles.
Define the rank of a variable
to be the length of the longest path in the graph beginning at A.
A
The absence of
A is finite. We also observe that if A -+ BC is a B and C must be strictly less than the rank of A, from B or C, there is a path of length one greater from A. on r that if A has rank r, then no terminal string derived
cycles implies that the rank of
production, then the rank of
because for every path
We
show by induction
A
from Basis
has length greater than 2 r
—
0. If
A
has rank
0,
r .
then
its
vertex has
/l-productions have terminals on the right, and Induction
r
>
string of length less,
0. If 1.
A
no edges
out. Therefore all
derives only strings of length
we use a production of the form A -» a, we may derive only a we begin with A -* BC, then as B and C are of rank r — 1 or
If
r by the inductive hypothesis, they derive only strings of length 2
Thus
BC
Since variables,
1.
cannot derive a string of length greater than 2
~
1
or
less.
r .
S is of finite rank r0 and in fact, is of rank no greater than the number of S derives strings of length no greater than 2 r °. Thus the language is ,
finite.
Example 6.6 Consider the grammar

S → AB
A → BC | a
B → CC | b
C → a

whose graph is shown in Fig. 6.7(a). This graph has no cycles. The ranks of S, A, B, and C are 3, 2, 1, and 0, respectively. For example, the longest path from S is S, A, B, C. Thus this grammar derives no string of length greater than 2^3 = 8 and therefore generates a finite language. In fact, a longest string generated from S is

S ⇒ AB ⇒ BCB ⇒ CCCB ⇒ CCCCC ⇒* aaaaa.

[Fig. 6.7 Graphs corresponding to CNF grammars.]

If we add production C → AB, we get the graph of Fig. 6.7(b). This new graph has several cycles, such as A, B, C, A. Thus we can find a derivation A ⇒ BC ⇒ CCC ⇒ CABC. As C ⇒* a and BC ⇒* ba, we have A ⇒* aAba. Then as S ⇒* Ab and A ⇒* a, we now have S ⇒* a^i a (ba)^i b for every i. Thus the language is infinite.
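The graph test of Theorem 6.6 can be sketched as cycle detection by depth-first search; the encoding (a dict from each variable to its list of CNF bodies) is our own. Applied to Example 6.6, the grammar is reported finite, and infinite once C → AB is added.

```python
def is_finite(prods):
    """Finiteness test for a CNF grammar with no useless symbols:
    build the variable graph (edge A -> B for productions A -> BC or
    A -> CB) and report finite iff it has no cycle."""
    edges = {A: set() for A in prods}
    for A, bodies in prods.items():
        for body in bodies:
            if len(body) == 2:              # a body BC of two variables
                edges[A].update(body)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {A: WHITE for A in prods}
    def has_cycle(A):                       # DFS; GRAY marks the current path
        color[A] = GRAY
        for B in edges[A]:
            if color[B] == GRAY or (color[B] == WHITE and has_cycle(B)):
                return True
        color[A] = BLACK
        return False
    return not any(color[A] == WHITE and has_cycle(A) for A in prods)
```

The DFS visits each edge once, so the test runs in time linear in the size of the grammar.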
Membership

Another question we may answer is: Given a CFG G = (V, T, P, S) and string x in T*, is x in L(G)? A simple but inefficient algorithm to do so is to convert G to G' = (V', T, P', S), a grammar in Greibach normal form generating L(G) − {ε}. Since the algorithm of Theorem 4.3 tests whether S ⇒* ε, we need not concern ourselves with the case x = ε. Thus assume x ≠ ε, so x is in L(G') if and only if x is in L(G). Now, as every production of a GNF grammar adds exactly one terminal to the string being generated, we know that if x has a derivation in G', it has one with exactly |x| steps. If no variable of G' has more than k productions, then there are at most k^|x| leftmost derivations of strings of length |x|. We may try them all systematically.

However, the above algorithm can take time which is exponential in |x|. There are several algorithms known that take time proportional to the cube of |x| or even a little less. The bibliographic notes discuss some of these. We shall here present a simple cubic-time algorithm known as the Cocke-Younger-Kasami or CYK algorithm. It is based on the dynamic programming technique discussed in the solution to Exercise 3.23. Given x of length n ≥ 1, and a grammar G, which we may assume is in Chomsky normal form, determine for each i and j and for each variable A, whether A ⇒* x_ij, where x_ij is the substring of x of length j beginning at position i.

We proceed by induction on j. For j = 1, A ⇒* x_ij if and only if A → x_ij is a production, since x_ij is a string of length 1. Proceeding to higher values of j, if j > 1, then A ⇒* x_ij if and only if there is some production A → BC and some k, 1 ≤ k < j, such that B ⇒* x_ik and C ⇒* x_{i+k, j−k}. Since k and j − k are both less than j, we already know whether each of the last two derivations exists. We may thus determine whether A ⇒* x_ij. Finally, when we reach j = n, we may determine whether S ⇒* x_1n. But x_1n = x, so x is in L(G) if and only if S ⇒* x_1n.

To state the CYK algorithm precisely, let V_ij be the set of variables A such that A ⇒* x_ij. Note that we may assume 1 ≤ i ≤ n − j + 1, for there is no string of length greater than n − i + 1 beginning at position i. Fig. 6.8 gives the CYK algorithm formally.
begin
1)   for i := 1 to n do
2)     V_i1 := {A | A → a is a production and the ith symbol of x is a};
3)   for j := 2 to n do
4)     for i := 1 to n − j + 1 do
       begin
5)       V_ij := ∅;
6)       for k := 1 to j − 1 do
7)         V_ij := V_ij ∪ {A | A → BC is a production, B is in V_ik, and C is in V_{i+k, j−k}}
       end
end

Fig. 6.8 The CYK algorithm.

Steps (1) and (2) handle the case j = 1. As the grammar is fixed, step (2) takes a constant amount of time. Thus steps (1) and (2) take O(n) time. The nested for-loops of lines (3) and (4) cause steps (5) through (7) to be executed at most n^2 times, since i and j range in their respective for-loops between limits that are at most n apart. Step (5) takes constant time at each execution, so the aggregate time spent at step (5) is O(n^2). The for-loop of line (6) causes step (7) to be executed n or fewer times. Since step (7) takes constant time, steps (6) and (7) together take O(n) time. As they are executed O(n^2) times, the total time spent in step (7) is O(n^3). Thus the entire algorithm is O(n^3).
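A direct transcription of Fig. 6.8 in Python; the grammar encoding (a dict from each variable to its list of CNF bodies) is our own.

```python
def cyk(x, prods, start='S'):
    n = len(x)
    # V[i][j] = set of variables deriving the substring of length j
    # beginning at position i (positions are 1-based, as in the text)
    V = {i: {j: set() for j in range(1, n - i + 2)} for i in range(1, n + 1)}
    for i in range(1, n + 1):                          # steps (1)-(2)
        V[i][1] = {A for A, bodies in prods.items() if x[i - 1] in bodies}
    for j in range(2, n + 1):                          # step (3)
        for i in range(1, n - j + 2):                  # steps (4)-(5)
            for k in range(1, j):                      # steps (6)-(7)
                for A, bodies in prods.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in V[i][k]
                                and body[1] in V[i + k][j - k]):
                            V[i][j].add(A)
    return start in V[1][n], V
```

Run on the grammar S → AB | BC, A → BA | a, B → CC | b, C → AB | a and the input baaba, it fills the same table of V_ij's as the worked computation below, with S landing in V_15.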
Example 6.7 Consider the CFG

S → AB | BC
A → BA | a
B → CC | b
C → AB | a

and the input string baaba. The table of V_ij's is shown in Fig. 6.9. The top row is filled in by steps (1) and (2) of the algorithm in Fig. 6.8. That is, for positions 1 and 4, which are b, we set V_11 = V_41 = {B}, since B is the only variable which derives b.

           b          a          a          b          a
  j = 1    {B}        {A, C}     {A, C}     {B}        {A, C}
  j = 2    {S, A}     {B}        {S, C}     {S, A}
  j = 3    ∅          {B}        {B}
  j = 4    ∅          {S, A, C}
  j = 5    {S, A, C}

Fig. 6.9 Table of V_ij's.
Similarly, V_21 = V_31 = V_51 = {A, C}, since only A and C have productions with a on the right.

To compute V_ij for j > 1, we must execute the for-loop of steps (6) and (7). We must match V_ik against V_{i+k, j−k} for k = 1, 2, ..., j − 1, seeking a variable D in V_ik and E in V_{i+k, j−k} such that DE is the right side of one or more productions. The left sides of these productions are adjoined to V_ij. The pattern in the table which corresponds to visiting V_ik and V_{i+k, j−k} for k = 1, 2, ..., j − 1 in turn is to simultaneously move down column i and up the diagonal extending from V_ij to the right, as shown in Fig. 6.10.

[Fig. 6.10 Traversal pattern for computation of V_ij.]

For example, let us compute V_24, assuming that the top three rows of Fig. 6.9 are filled in. We begin by looking at V_21 V_33 = {A, C}{B} = {AB, CB}. Only the first of these is actually a right side, AB being the right side of two productions S → AB and C → AB. Hence we add S and C to V_24. Next we consider V_22 V_42 = {B}{S, A} = {BS, BA}. Only BA is a right side, so we add the corresponding left side A to V_24. Finally, we consider V_23 V_51 = {B}{A, C} = {BA, BC}. BA and BC are each right sides, with left sides A and S, respectively. These are already in V_24, so we have V_24 = {S, A, C}. Since S is a member of V_15, the string baaba is in the language generated by the grammar.

EXERCISES

6.1 Show that the following are not context-free languages.
a) {a^i b^j c^k | i < j < k}
b) {a^i b^j | j = i^2}
c) {a^i | i is a prime}
d) the set of strings of a's, b's, and c's with an equal number of each
e) {a^n b^n c^m | n ≤ m ≤ 2n}

6.2 Which of the following are CFL's?
a) {a^i b^j | i ≠ j and i ≠ 2j}
b) (a + b)* − {(a^n b^n)^n | n ≥ 1}
c) {w w^R w | w is in (a + b)*}
d) {b_i | b_i is i in binary, i ≥ 1}
e) {wxw | w and x are in (a + b)*}
f) (a + b)* - {(a^n b^n)^n | n > 1}
6.3 Prove that the following are not CFL's.
a) {a^i b^j c^k | j = max{i, k}}
b) {a^n b^n c^i | i != n}  [Hint: Use Ogden's lemma on a string of the form a^n b^n c^{n!+n}.]

6.4 Show that the CFL's are closed under the following operations:
*a) Quotient with a regular set, that is, if L is a CFL and R a regular set, then L/R is a CFL.
b) INIT
*c) CYCLE
d) reversal
See Exercise 3.4 for the definitions of INIT and CYCLE.

6.5 Show that the CFL's are not closed under the following operations.
*a) MIN
b) MAX
c) 1/2
d) Inverse substitution
e) INV, where INV(L) = {x | x = wyz and wy^R z is in L}
MIN, MAX, and 1/2 are defined in Exercises 3.4 and 3.16.
6.6 Let Σ be an alphabet, and for each a in Σ let ā be a new symbol. Define homomorphisms h, h_1, and h_2 by h(a) = h(ā) = a, h_1(a) = a, h_1(ā) = ε, h_2(a) = ε, and h_2(ā) = a for each a in Σ. For L_1 ⊆ Σ* and L_2 ⊆ Σ*, define

    Shuffle(L_1, L_2) = {x | for some y in h^{-1}(x), h_1(y) is in L_1 and h_2(y) is in L_2}.

That is, the Shuffle of L_1 and L_2 is the set of words formed by "shuffling" a word of L_1 with a word of L_2. Symbols from the two words need not alternate as in a "perfect shuffle."
a) Show that the Shuffle of two regular sets is regular.
b) Prove that the Shuffle of a CFL and a regular set is a CFL.
c) Prove that the Shuffle of two CFL's is not necessarily a CFL.
6.7 A Dyck language is a language with k types of balanced parentheses. Formally, each Dyck language is, for some k, L(G_k), where G_k is the grammar

    S -> SS | [_1 S ]_1 | [_2 S ]_2 | ... | [_k S ]_k | ε.

For example, a string such as [_1 [_2 ]_2 ]_1 [_1 ]_1 is in the Dyck language with two kinds of parentheses. Prove that every CFL L is h(L_D ∩ R), where h is a homomorphism, R a regular set, and L_D a Dyck language. [Hint: Let L be accepted by empty stack by a PDA in the normal form of Exercise 5.5(b), where the moves only push or pop single symbols. Let the parentheses be [_abX and ]_abX, where [_abX "means" on input a, stack symbol X is pushed, and matching ]_abX "means" on input b, X may be popped (a or b may be ε). Then the Dyck language enforces the condition that the stack be handled consistently, i.e., if X is pushed, then it will still be X when it is popped. Let the regular set R enforce the condition that there be a sequence of states for which the push and pop moves are legal for inputs a and b, respectively. Let h([_abX) = a and h(]_abX) = b.]
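Membership in a Dyck language can be decided with a stack, mirroring the PDA view taken in the hint above. The sketch below is our own illustration for k = 2, writing the two parenthesis types as "[ ]" and "( )".

```python
# Stack-based membership test for the Dyck language with two kinds of
# parentheses.  An open parenthesis pushes its expected closer; a close
# parenthesis must match the top of the stack, as in the PDA of the hint.

def in_dyck(s, pairs={'[': ']', '(': ')'}):
    stack = []
    closers = set(pairs.values())
    for c in s:
        if c in pairs:                 # open parenthesis: push expected closer
            stack.append(pairs[c])
        elif c in closers:             # close parenthesis: must match the top
            if not stack or stack.pop() != c:
                return False
        else:
            return False               # not a parenthesis symbol at all
    return not stack                   # balanced iff the stack empties

print(in_dyck("[()[]]"))   # True
print(in_dyck("[(])"))     # False
```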
6.8 Show that if L is a CFL over a one-symbol alphabet, then L is regular. [Hint: Let n be the pumping lemma constant for L and let L ⊆ 0*. Show that for every word of length n or more, say 0^m, there are p and q no greater than n such that 0^{p+iq} is in L for all i >= 0. Then show that L consists of perhaps some words of length less than n plus a finite number of linear sets, i.e., sets of the form {0^{p+iq} | i >= 0} for fixed p and q.]

6.9 Prove that the set of primes in binary is not a CFL.

6.10 Show that the linear languages (see Exercise 4.20 for a definition) are closed under
a) union
b) homomorphism
c) intersection with a regular set

6.11 Prove the following pumping lemma for linear languages. If L is a linear language, then there is a constant n such that if z is in L and |z| >= n, we may write z = uvwxy such that |uvxy| <= n, |vx| >= 1, and for all i >= 0, u v^i w x^i y is in L.

6.12 Show that {a^i b^i c^j d^j | i >= 1 and j >= 1} is not a linear language.

6.13 A PDA is said to make a turn if it enters a sequence of ID's

    (q_1, w_1, γ_1) ⊢* (q_2, w_2, γ_2) ⊢* (q_3, w_3, γ_3)

in which |γ_2| is strictly greater than |γ_1| and |γ_3|. That is, a turn occurs when the length of the stack "peaks." A PDA M is said to be k-turn if every w in L(M) is accepted by a sequence of ID's making no more than k turns. If a PDA is k-turn for some k, it is said to be finite-turn.
a) Show that a language L is linear if and only if it is accepted by a one-turn PDA.
b) Show that L is accepted by a finite-turn PDA if and only if L is metalinear.
c) Show that the linear languages are closed under inverse homomorphism.
d) Show that the metalinear languages are closed under union, concatenation, homomorphism, inverse homomorphism, and intersection with a regular set.

6.14 Show that the set of strings with an equal number of a's and b's is a CFL that is not a metalinear language.
6.15 Show that
a) the linear languages
**b) the metalinear languages
are not closed under Kleene closure.

6.16 Give an algorithm to decide, for two sentential forms α and β of a CFG G, whether α ⇒*_G β.

6.17 Use the CYK algorithm to determine whether
a) aaaaa
b) aaaaaa
are in the language generated by the grammar of Example 6.7.

6.18 Let G be a context-free grammar in CNF.
a) Give an algorithm to determine the number of distinct derivations of a string x.
b) Associate a cost with each production of G. Give an algorithm to produce a minimum-cost parse of a string x. The cost of a parse is the sum of the costs of the productions used. [Hint: Modify the CYK algorithm of Section 6.3.]
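As a concrete starting point for Exercise 6.18(a), the CYK table of Fig. 6.8 can be made to carry counts instead of mere membership: each entry maps a variable to the number of distinct parse trees (equivalently, leftmost derivations) of the corresponding substring. The grammar encoding below is our own, and the ambiguous toy grammar S -> SS | a is chosen only to exercise the counting; it is not from the text.

```python
# CYK with counting: N[(i, j)][A] = number of parse trees by which A
# derives the substring of length j beginning at position i.

def count_parses(x, unit_prods, pair_prods, start):
    n = len(x)
    N = {}
    for i in range(1, n + 1):
        N[i, 1] = {A: 1 for (A, a) in unit_prods if a == x[i - 1]}
    for j in range(2, n + 1):
        for i in range(1, n - j + 2):
            entry = {}
            for k in range(1, j):
                for (A, B, C) in pair_prods:
                    if B in N[i, k] and C in N[i + k, j - k]:
                        # each left tree pairs with each right tree
                        entry[A] = entry.get(A, 0) + N[i, k][B] * N[i + k, j - k][C]
            N[i, j] = entry
    return N[1, n].get(start, 0)

# S -> SS | a is ambiguous: parse counts follow the Catalan numbers.
print(count_parses("aaa", {('S', 'a')}, {('S', 'S', 'S')}, 'S'))    # 2
print(count_parses("aaaa", {('S', 'a')}, {('S', 'S', 'S')}, 'S'))   # 5
```

Replacing the sum of products by a minimum over (cost of production + costs of subtrees) gives the minimum-cost parse of part (b), exactly in the spirit of the hint.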
Solutions to Selected Exercises

6.4 c) Let G = (V, T, P, S) be a CFG in CNF. To construct Ĝ such that L(Ĝ) = CYCLE(L(G)), consider a derivation tree of a string x_1 x_2 in grammar G. Follow the path from S to the leftmost symbol of x_2. We wish to generate the path in reverse order (bottom to top) and output symbols on opposite sides of the path from which they originally appeared. To do this construct

    Ĝ = (V ∪ {A' | A in V} ∪ {S_0}, T, P', S_0),

where P' contains
1) all productions of P,
2) C' -> A'B and B' -> CA' if P contains A -> BC,
3) S' -> ε,
4) S_0 -> aA' if P contains A -> a,
5) S_0 -> S.

To see that L(Ĝ) = CYCLE(L(G)), show by induction on the length of a derivation that A ⇒* A_1 A_2 ... A_n if and only if, for each i, A_i' ⇒* A_{i+1} ... A_n A' A_1 ... A_{i-1}. Then S ⇒* A_1 ... A_n with A_i ⇒ a if and only if

    S_0 ⇒ aA_i' ⇒* aA_{i+1} ... A_n S' A_1 ... A_{i-1} ⇒* aA_{i+1} ... A_n A_1 ... A_{i-1}.

A derivation tree of G is shown in Fig. 6.11(a) with a corresponding tree for Ĝ in Fig. 6.11(b).
6.5 a) Let L be the CFL {0^i 1^j 2^k | i <= k or j <= k}. L is generated by the CFG

    S -> AB | C,    A -> 0A | ε,    B -> 1B2 | B2 | ε,
    C -> 0C2 | C2 | D,    D -> 1D | ε

MIN(L) = {0^i 1^j 2^k | k = min(i, j)}. We claim MIN(L) is not a CFL. Suppose it were, and let n be the pumping lemma constant. Consider z = 0^n 1^n 2^n = uvwxy. If vwx contains no 2's, then uwy is not in MIN(L). If vx has a 2, it cannot have a 0, since |vwx| <= n. Thus uv^2 wx^2 y has at least n + 1 2's, at least n 1's, and exactly n 0's; it is thus not in MIN(L).
Fig. 6.11 Tree transformation used for Exercise 6.4(c): (a) a derivation tree in G; (b) the corresponding tree in Ĝ.
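The construction in this solution is mechanical enough to state as code. The sketch below builds the production sets of Ĝ from those of G; representing a primed (barred) variable A' as '~A' and the new start symbol as 'S0' are our own encoding choices, not the text's.

```python
# Build the productions of G-hat for CYCLE(L(G)) from a CNF grammar G,
# following rules 1)-5) of the solution to Exercise 6.4(c).

def cycle_grammar(pair_prods, unit_prods, start):
    """pair_prods: set of (A, B, C) for A -> BC; unit_prods: set of (A, a)."""
    bar = lambda A: '~' + A
    pair = set(pair_prods)                      # 1) all productions of P
    for (A, B, C) in pair_prods:                # 2) C' -> A'B and B' -> CA'
        pair.add((bar(C), bar(A), B))
        pair.add((bar(B), C, bar(A)))
    eps = {bar(start)}                          # 3) S' -> epsilon
    start_prods = {('S0', a, bar(A))            # 4) S0 -> a A'
                   for (A, a) in unit_prods}
    chain = {('S0', start)}                     # 5) S0 -> S
    return pair, start_prods, eps, chain

# G: S -> AB, A -> a, B -> b, so L(G) = {ab} and CYCLE(L(G)) = {ab, ba}.
pair, start_prods, eps, chain = cycle_grammar(
    {('S', 'A', 'B')}, {('A', 'a'), ('B', 'b')}, 'S')
print(('~B', '~S', 'A') in pair)         # True: ~B -> ~S A
print(('S0', 'b', '~B') in start_prods)  # True: S0 -> b ~B
```

In the resulting grammar, ba is derived by S0 => b ~B => b ~S A => bA => ba, reversing the path exactly as the solution describes.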
BIBLIOGRAPHIC NOTES

The pumping lemma for context-free languages is from Bar-Hillel, Perles, and Shamir [1961]; Ogden's lemma, in its stronger version, is found in Ogden [1968]. Wise [1976] gives a necessary and sufficient condition for a language to be context free. Parikh [1966] gives necessary conditions in terms of the distribution of symbols in words of the language. Pumping lemmas for other classes of languages are given in Boasson [1973] and Ogden [1969].

Theorem 6.2, closure under substitution, and Theorem 6.5, closure under intersection with a regular set, are from Bar-Hillel, Perles, and Shamir [1961]. Theorem 6.3 is from Ginsburg and Rose [1963b]. Theorem 6.4 and its corollary, nonclosure under intersection or complementation, are from Scheinberg [1960]. Theorem 6.6, the existence of an algorithm to tell whether a CFL is finite, is also from Bar-Hillel, Perles, and Shamir [1961]. Floyd [1962b] shows how to apply closure properties to prove language constructs not to be context free.

The CYK algorithm was originally discovered by J. Cocke, but its first publication was due independently to Kasami [1965] and Younger [1967]. The most practical, general, context-free recognition and parsing algorithm is by Earley [1970]. This algorithm is O(n^3) in general, but takes only O(n^2) on any unambiguous CFG and is actually linear on a wide variety of useful grammars. The algorithm of Valiant [1975a] is asymptotically the most efficient, taking O(n^2.81) steps, while the algorithm of Graham, Harrison, and Ruzzo [1976] takes O(n^3/log n) steps. A related result, that membership for unambiguous CFG's can be tested in O(n^2) time, is due to Kasami and Torii [1969] and Earley [1970].

Exercise 6.4(a), closure of CFL's under quotient with a regular set, was shown by Ginsburg and Spanier [1963]. Additional closure properties of CFL's can be found in Ginsburg and Rose [1963b, 1966]. Exercise 6.7, the characterization of CFL's by Dyck languages, is from Chomsky [1962]. Stanley [1965] showed the stronger result that the Dyck language used need depend only on the size of the terminal alphabet. The proof that the primes in binary are not a CFL (Exercise 6.9) is from Hartmanis and Shank [1968]. Finite-turn PDA's, mentioned in Exercise 6.13, were studied by Ginsburg and Spanier [1966]. Exercise 6.8, that CFL's over a one-symbol alphabet are regular, was shown by Ginsburg and Rice [1962].
CHAPTER 7

TURING MACHINES

In this chapter we introduce the Turing machine, a simple mathematical model of a computer. Despite its simplicity, the Turing machine models the computing capability of a general-purpose computer. The Turing machine is studied both for the class of languages it defines (called the recursively enumerable sets) and the class of integer functions it computes (called the partial recursive functions). A variety of other models of computation are introduced and shown to be equivalent to the Turing machine in computing power.

7.1 INTRODUCTION

The intuitive notion of an algorithm or effective procedure has arisen several times. In Chapter 3 we exhibited an effective procedure to determine if the set accepted by a finite automaton was empty, finite, or infinite. One might naively assume that for any class of languages with finite descriptions, there exists an effective procedure for answering such questions. However, this is not the case. For example, there is no algorithm to tell whether the complement of a CFL is empty (although we can tell whether the CFL itself is empty). Note that we are not asking for a procedure that answers the question for a specific context-free language, but rather a single procedure that will correctly answer the question for all CFL's. It is clear that if we need only determine whether one specific CFL has an empty complement, then an algorithm to answer the question exists. That is, there is one algorithm that says "yes" and another that says "no," independent of their inputs. One of these must be correct. Of course, which of the two algorithms answers the question correctly may not be obvious.
At the turn of the century, the mathematician David Hilbert set out on a program to find an algorithm for determining the truth or falsity of any mathematical proposition. In particular, he was looking for a procedure to determine if an arbitrary formula in the first-order predicate calculus, applied to integers, was true. Since the first-order predicate calculus is powerful enough to express the statement that the language generated by a context-free grammar is Σ*, had Hilbert been successful, our problem of deciding whether the complement of a CFL is empty would be solved. However, in 1931, Kurt Gödel published his famous incompleteness theorem, which proved that no such effective procedure could exist. He constructed a formula in the predicate calculus applied to integers, whose very definition stated that it could neither be proved nor disproved within this logical system. The formalization of this argument and the subsequent clarification and formalization of our intuitive notion of an effective procedure is one of the great intellectual achievements of this century.

Once the notion of an effective procedure was formalized, it was shown that there was no effective procedure for computing many specific functions. Actually the existence of such functions is easily seen from a counting argument. Consider the class of functions mapping the nonnegative integers onto {0, 1}. These functions can be put into one-to-one correspondence with the reals. However, if we assume that effective procedures have finite descriptions, then the class of all effective procedures can be put into one-to-one correspondence with the integers. Since there is no one-to-one correspondence between the integers and the reals, there must exist functions with no corresponding effective procedures to compute them. There are simply too many functions, a noncountable number, and only a countable number of procedures. Thus the existence of noncomputable functions is not surprising. What is surprising is that some problems and functions with genuine significance in mathematics, computer science, and other disciplines are noncomputable.

Today the Turing machine has become the accepted formalization of an effective procedure. Clearly one cannot prove that the Turing machine model is equivalent to our intuitive notion of a computer, but there are compelling arguments for this equivalence, which has become known as Church's hypothesis. In particular, the Turing machine is equivalent in computing power to the digital computer as we know it today and also to all the most general mathematical notions of computation.
7.2 THE TURING MACHINE MODEL

A formal model for an effective procedure should possess certain properties. First, each procedure should be finitely describable. Second, the procedure should consist of discrete steps, each of which can be carried out mechanically. Such a model was introduced by Alan Turing in 1936. We present a variant of it here.

The basic model, illustrated in Fig. 7.1, has a finite control, an input tape that is divided into cells, and a tape head that scans one cell of the tape at a time. The tape has a leftmost cell but is infinite to the right. Each cell of the tape may hold exactly one of a finite number of tape symbols. Initially, the n leftmost cells, for some finite n >= 0, hold the input, which is a string of symbols chosen from a subset of the tape symbols called the input symbols. The remaining infinity of cells each hold the blank, which is a special tape symbol that is not an input symbol.

Fig. 7.1 Basic Turing machine.

In one move the Turing machine, depending upon the symbol scanned by the tape head and the state of the finite control,
1) changes state,
2) prints a symbol on the tape cell scanned, replacing what was written there, and
3) moves its head left or right one cell.

Note that the difference between a Turing machine and a two-way finite automaton lies in the former's ability to change symbols on its tape.

Formally, a Turing machine (TM) is denoted

    M = (Q, Σ, Γ, δ, q_0, B, F),

where

Q is the finite set of states,
Γ is the finite set of allowable tape symbols,
B, a symbol of Γ, is the blank,
Σ, a subset of Γ not including B, is the set of input symbols,
δ is the next move function, a mapping from Q × Γ to Q × Γ × {L, R} (δ may, however, be undefined for some arguments),
q_0 in Q is the start state,
F ⊆ Q is the set of final states.

We denote an instantaneous description (ID) of the Turing machine M by α_1 q α_2. Here q, the current state of M, is in Q; α_1 α_2 is the string in Γ* that is the contents of the tape up to the rightmost nonblank symbol or the symbol to the left of the head, whichever is rightmost. (Observe that the blank B may occur in α_1 α_2.) We assume that Q and Γ are disjoint to avoid confusion. Finally, the tape head is assumed to be scanning the leftmost symbol of α_2, or if α_2 = ε, the head is scanning a blank.
We define a move of M as follows. Let X_1 X_2 ... X_{i-1} q X_i ... X_n be an ID. Suppose δ(q, X_i) = (p, Y, L), where if i - 1 = n, then X_i is taken to be B. If i = 1, then there is no next ID, as the tape head is not allowed to fall off the left end of the tape. If i > 1, then we write

    X_1 X_2 ... X_{i-1} q X_i ... X_n  ⊢  X_1 X_2 ... X_{i-2} p X_{i-1} Y X_{i+1} ... X_n.    (7.1)

However, if any suffix of X_{i-1} Y X_{i+1} ... X_n is completely blank, that suffix is deleted in (7.1).

Alternatively, suppose δ(q, X_i) = (p, Y, R). Then we write

    X_1 X_2 ... X_{i-1} q X_i ... X_n  ⊢  X_1 X_2 ... X_{i-1} Y p X_{i+1} ... X_n.    (7.2)

Note that in the case i - 1 = n, the string X_i ... X_n is empty, and the right side of (7.2) is longer than the left side.

If two ID's are related by ⊢, we say that the second results from the first by one move. If one ID results from another by some finite number of moves, including zero moves, they are related by the symbol ⊢*. We drop the subscript M from ⊢ or ⊢* when no confusion results.

The language accepted by M, denoted L(M), is the set of those words in Σ* that cause M to enter a final state when placed, justified at the left, on the tape of M, with M in state q_0 and the tape head of M at the leftmost cell. Formally, the language accepted by M = (Q, Σ, Γ, δ, q_0, B, F) is

    {w | w in Σ* and q_0 w ⊢* α_1 p α_2 for some p in F, and α_1 and α_2 in Γ*}.

Given a TM recognizing a language L, we assume without loss of generality that the TM halts, i.e., has no next move, whenever the input is accepted. However, for words not accepted, it is possible that the TM will never halt.
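The definitions above translate directly into a small simulator. The representation below (a dict for δ, a growable list for the right-infinite tape) is our own, and the step bound is a practical guard against the nonhalting behavior the text has just warned about.

```python
# A simulator for the one-tape TM of this section.  delta maps
# (state, symbol) to (state, symbol, 'L' or 'R'); a missing entry means
# delta is undefined there, so the machine halts.

def run_tm(delta, q0, final, blank, w, max_steps=10_000):
    tape = list(w) or [blank]
    q, head = q0, 0
    for _ in range(max_steps):
        if q in final:
            return True                   # entered a final state: accept
        move = delta.get((q, tape[head]))
        if move is None:
            return False                  # no next move: halt without accepting
        q, tape[head], d = move
        head += 1 if d == 'R' else -1
        if head < 0:
            return False                  # would fall off the left end
        if head == len(tape):
            tape.append(blank)            # the tape is infinite to the right
    raise RuntimeError("step limit exceeded (machine may not halt)")

# A trivial machine accepting exactly the strings that begin with 0:
d = {('p', '0'): ('f', '0', 'R')}
print(run_tm(d, 'p', {'f'}, 'B', '01'), run_tm(d, 'p', {'f'}, 'B', '10'))
# -> True False
```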
Example 7.1 The design of a TM M to accept the language L = {0^n 1^n | n >= 1} is given below. Initially, the tape of M contains 0^n 1^n followed by an infinity of blanks. Repeatedly, M replaces the leftmost 0 by X, moves right to the leftmost 1, replacing it by Y, moves left to find the rightmost X, then moves one cell right to the leftmost 0 and repeats the cycle. If, however, when searching for a 1, M finds a blank instead, then M halts without accepting. If, after changing a 1 to a Y, M finds no more 0's, then M checks that no more 1's remain, accepting if there are none.

Let Q = {q_0, q_1, q_2, q_3, q_4}, Σ = {0, 1}, Γ = {0, 1, X, Y, B}, and F = {q_4}. Informally, each state represents a statement or a group of statements in a program. State q_0 is entered initially and also immediately prior to each replacement of a leftmost 0 by an X. State q_1 is used to search right, skipping over 0's and Y's until it finds the leftmost 1. If M finds a 1, it changes it to Y, entering state q_2.
State q_2 searches left for an X and enters state q_0 upon finding it, moving right, to the leftmost 0, as it changes state. As M searches right in state q_1, if a B or X is encountered before a 1, then the input is rejected; either there are too many 0's or the input is not in 0*1*.

State q_0 has another role. If, after state q_2 finds the rightmost X, there is a Y immediately to its right, then the 0's are exhausted. From q_0, scanning Y, state q_3 is entered to scan over Y's and check that no 1's remain. If the Y's are followed by a B, state q_4 is entered and acceptance occurs; otherwise the string is rejected. The function δ is shown in Fig. 7.2. Figure 7.3 shows the computation of M on input 0011. For example, the first move is explained by the fact that δ(q_0, 0) = (q_1, X, R); the last move is explained by the fact that δ(q_3, B) = (q_4, B, R). The reader should simulate M on some rejected inputs such as 001101, 001, and 011.
                              Symbol
State        0            1            X            Y            B
q_0     (q_1, X, R)       -            -       (q_3, Y, R)       -
q_1     (q_1, 0, R)  (q_2, Y, L)       -       (q_1, Y, R)       -
q_2     (q_2, 0, L)       -       (q_0, X, R)  (q_2, Y, L)       -
q_3          -            -            -       (q_3, Y, R)  (q_4, B, R)
q_4          -            -            -            -            -

Fig. 7.2 The function δ.
q_0 0011 ⊢ X q_1 011 ⊢ X0 q_1 11 ⊢ X q_2 0Y1 ⊢ q_2 X0Y1 ⊢ X q_0 0Y1
         ⊢ XX q_1 Y1 ⊢ XXY q_1 1 ⊢ XX q_2 YY ⊢ X q_2 XYY ⊢ XX q_0 YY
         ⊢ XXY q_3 Y ⊢ XXYY q_3 ⊢ XXYYB q_4

Fig. 7.3 A computation of M.
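The next-move function of Fig. 7.2 can be checked mechanically. The simulator below is our own direct implementation of the move rules of Section 7.2; it reproduces the accepting computation of Fig. 7.3 and the rejections the text asks the reader to trace by hand.

```python
# The machine of Example 7.1, with delta exactly as in Fig. 7.2.

delta = {
    ('q0', '0'): ('q1', 'X', 'R'), ('q0', 'Y'): ('q3', 'Y', 'R'),
    ('q1', '0'): ('q1', '0', 'R'), ('q1', '1'): ('q2', 'Y', 'L'),
    ('q1', 'Y'): ('q1', 'Y', 'R'),
    ('q2', '0'): ('q2', '0', 'L'), ('q2', 'X'): ('q0', 'X', 'R'),
    ('q2', 'Y'): ('q2', 'Y', 'L'),
    ('q3', 'Y'): ('q3', 'Y', 'R'), ('q3', 'B'): ('q4', 'B', 'R'),
}

def accepts(w):
    tape, q, head = list(w) or ['B'], 'q0', 0
    while True:
        if q == 'q4':                      # the only final state
            return True
        step = delta.get((q, tape[head]))
        if step is None:                   # delta undefined: reject
            return False
        q, tape[head], d = step
        head += 1 if d == 'R' else -1
        if head < 0:
            return False
        if head == len(tape):
            tape.append('B')

print([w for w in ("01", "0011", "001101", "001", "011") if accepts(w)])
# -> ['01', '0011']
```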
7.3 COMPUTABLE LANGUAGES AND FUNCTIONS

A language that is accepted by a Turing machine is said to be recursively enumerable (r.e.). The term "enumerable" derives from the fact that it is precisely these languages whose strings can be enumerated (listed) by a Turing machine. "Recursively" is a mathematical term predating the computer, and its meaning is similar to what the computer scientist would call "recursion." The class of r.e. languages is very broad and properly includes the CFL's.

The class of r.e. languages includes some languages for which we cannot mechanically determine membership. If L(M) is such a language, then any Turing
machine recognizing L(M) must fail to halt on some input not in L(M). If w is in L(M), M eventually halts on input w. However, as long as M is still running on some input, we can never tell whether M will eventually accept if we let it run long enough, or whether M will run forever.

It is convenient to single out a subclass of the r.e. sets, called the recursive sets, which are those languages accepted by at least one Turing machine that halts on all inputs (note that halting may or may not be preceded by acceptance). We shall see in Chapter 8 that the recursive sets are a proper subclass of the r.e. sets. Note also that by the algorithm of Fig. 6.8, every CFL is a recursive set.
The Turing machine as a computer of integer functions

In addition to being a language acceptor, the Turing machine may be viewed as a computer of functions from integers to integers. The traditional approach is to represent integers in unary; the integer i >= 0 is represented by the string 0^i. If a function has k arguments, i_1, i_2, ..., i_k, then these integers are initially placed on the tape separated by 1's, as 0^{i_1} 1 0^{i_2} 1 ... 1 0^{i_k}.

If the TM halts (whether or not in an accepting state) with a tape consisting of 0^m for some m, then we say that f(i_1, i_2, ..., i_k) = m, where f is the function of k arguments computed by this Turing machine. Note that one TM may compute a function of one argument, a different function of two arguments, and so on. Also note that if TM M computes function f of k arguments, then f need not have a value for all different k-tuples of integers i_1, ..., i_k.

If f(i_1, ..., i_k) is defined for all i_1, ..., i_k, then we say f is a total recursive function. A function f(i_1, ..., i_k) computed by a Turing machine is called a partial recursive function. In a sense, the partial recursive functions are analogous to the r.e. languages, since they are computed by Turing machines that may or may not halt on a given input. The total recursive functions correspond to the recursive languages, since they are computed by TM's that always halt. All common arithmetic functions on integers, such as multiplication, n!, ⌈log_2 n⌉, and 2^{2^n}, are total recursive functions.
Example 7.2 Proper subtraction m ∸ n is defined to be m - n for m >= n, and zero for m < n. The TM

    M = ({q_0, q_1, ..., q_6}, {0, 1}, {0, 1, B}, δ, q_0, B, ∅),

defined below, started with 0^m 1 0^n on its tape, halts with 0^{m ∸ n} on its tape. M repeatedly replaces its leading 0 by blank, then searches right for a 1 followed by a 0 and changes the 0 to 1. Next, M moves left until it encounters a blank and then repeats the cycle. The repetition ends if

i) Searching right for a 0, M encounters a blank. Then, the n 0's in 0^m 1 0^n have all been changed to 1's, and n + 1 of the m 0's have been changed to B. M replaces the n + 1 1's by a 0 and n B's, leaving m - n 0's on its tape.
ii) Beginning the cycle, M cannot find a 0 to change to a blank, because the first m 0's already have been changed. Then n >= m, so m ∸ n = 0. M replaces all remaining 1's and 0's by B.

The function δ is described below.

1) δ(q_0, 0) = (q_1, B, R)
Begin the cycle. Replace the leading 0 by B.

2) δ(q_1, 0) = (q_1, 0, R)
   δ(q_1, 1) = (q_2, 1, R)
Search right, looking for the first 1.

3) δ(q_2, 1) = (q_2, 1, R)
   δ(q_2, 0) = (q_3, 1, L)
Search right past 1's until encountering a 0. Change that 0 to 1.

4) δ(q_3, 0) = (q_3, 0, L)
   δ(q_3, 1) = (q_3, 1, L)
   δ(q_3, B) = (q_0, B, R)
Move left to a blank. Enter state q_0 to repeat the cycle.

5) δ(q_2, B) = (q_4, B, L)
   δ(q_4, 1) = (q_4, B, L)
   δ(q_4, 0) = (q_4, 0, L)
   δ(q_4, B) = (q_6, 0, R)
If in state q_2 a B is encountered before a 0, we have situation (i) described above. Enter state q_4 and move left, changing all 1's to B's until encountering a B. This B is changed back to a 0, state q_6 is entered, and M halts.

6) δ(q_0, 1) = (q_5, B, R)
   δ(q_5, 0) = (q_5, B, R)
   δ(q_5, 1) = (q_5, B, R)
   δ(q_5, B) = (q_6, B, R)
If in state q_0 a 1 is encountered instead of a 0, the first block of 0's has been exhausted, as in situation (ii) above. M enters state q_5 to erase the rest of the tape, then enters q_6 and halts.

A sample computation of M on input 0010 is:

q_0 0010 ⊢ B q_1 010 ⊢ B0 q_1 10 ⊢ B01 q_2 0 ⊢ B0 q_3 11 ⊢ B q_3 011
         ⊢ q_3 B011 ⊢ B q_0 011 ⊢ BB q_1 11 ⊢ BB1 q_2 1 ⊢ BB11 q_2
         ⊢ BB1 q_4 1 ⊢ BB q_4 1 ⊢ B q_4 ⊢ B0 q_6
On input 0100, M behaves as follows:

q_0 0100 ⊢ B q_1 100 ⊢ B1 q_2 00 ⊢ B q_3 110 ⊢ q_3 B110 ⊢ B q_0 110
         ⊢ BB q_5 10 ⊢ BBB q_5 0 ⊢ BBBB q_5 ⊢ BBBBB q_6
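The subtraction machine of Example 7.2 can also be run in software. The simulator below is our own; it encodes δ exactly as in rules (1) through (6) and reads the result off the tape by counting the remaining 0's.

```python
# The proper-subtraction machine of Example 7.2.  Input is 0^m 1 0^n;
# at the halting state q6 the tape holds exactly (m -. n) 0's.

delta = {
    ('q0', '0'): ('q1', 'B', 'R'), ('q0', '1'): ('q5', 'B', 'R'),
    ('q1', '0'): ('q1', '0', 'R'), ('q1', '1'): ('q2', '1', 'R'),
    ('q2', '1'): ('q2', '1', 'R'), ('q2', '0'): ('q3', '1', 'L'),
    ('q2', 'B'): ('q4', 'B', 'L'),
    ('q3', '0'): ('q3', '0', 'L'), ('q3', '1'): ('q3', '1', 'L'),
    ('q3', 'B'): ('q0', 'B', 'R'),
    ('q4', '1'): ('q4', 'B', 'L'), ('q4', '0'): ('q4', '0', 'L'),
    ('q4', 'B'): ('q6', '0', 'R'),
    ('q5', '0'): ('q5', 'B', 'R'), ('q5', '1'): ('q5', 'B', 'R'),
    ('q5', 'B'): ('q6', 'B', 'R'),
}

def monus(m, n):
    tape = list('0' * m + '1' + '0' * n)
    q, head = 'q0', 0
    while q != 'q6':                       # q6 is the halting state
        q, tape[head], d = delta[(q, tape[head])]
        head += 1 if d == 'R' else -1
        if head == len(tape):
            tape.append('B')
    return tape.count('0')                 # unary result left on the tape

print(monus(2, 1), monus(1, 2), monus(5, 3))   # 1 0 2
```

The first call retraces the sample computation on 0010 above; the second retraces the computation on 0100.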
7.4 TECHNIQUES FOR TURING MACHINE CONSTRUCTION

Designing Turing machines by writing out a complete set of states and a next-move function is a noticeably unrewarding task. In order to describe complicated Turing machine constructions we need some "higher-level" conceptual tools. In this section we shall discuss the principal ones.

Storage in the finite control

The finite control can be used to hold a finite amount of information. To do so, the state is written as a pair of elements, one exercising control and the other storing a symbol. It should be emphasized that this arrangement is for conceptual purposes only. No modification in the definition of the Turing machine has been made.

Example 7.3 Consider a Turing machine M that looks at the first input symbol, records it in its finite control, and checks that the symbol does not appear elsewhere on its input. Note that M accepts a regular set, but M will serve for demonstration purposes:

    M = (Q, {0, 1}, {0, 1, B}, δ, [q_0, B], B, F),

where Q consists of the pairs [q_0, 0], [q_0, 1], [q_0, B], [q_1, 0], [q_1, 1], and [q_1, B]. The set F is {[q_1, B]}. The intention is that the first component of the state controls the action, while the second component "remembers" a symbol.

We define δ as follows.

1) a) δ([q_0, B], 0) = ([q_1, 0], 0, R),
   b) δ([q_0, B], 1) = ([q_1, 1], 1, R).
Initially, q_0 is the first component of M's state, and M moves right. The first component of the state becomes q_1, and the first symbol seen is stored in the second component.

2) a) δ([q_1, 0], 1) = ([q_1, 0], 1, R),
   b) δ([q_1, 1], 0) = ([q_1, 1], 0, R).
If M has a 0 stored and sees a 1, or vice versa, then M continues to move right.

3) a) δ([q_1, 0], B) = ([q_1, B], 0, L),
   b) δ([q_1, 1], B) = ([q_1, B], 0, L).
M enters the final state [q_1, B] if it reaches a blank symbol without having encountered a second copy of the leftmost symbol.

If M reaches a blank in state [q_1, 0] or [q_1, 1], it accepts. For state [q_1, 0] and symbol 0, or for state [q_1, 1] and symbol 1, δ is not defined. Thus if M encounters the tape symbol stored in its state, M halts without accepting.

In general, we can allow the finite control to have k components, all but one of which store information.
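The pair-state idea of Example 7.3 is just ordinary program state. The sketch below is our own functional paraphrase, not a literal tape simulation: the "second component" of the state becomes a stored variable, and an undefined δ entry becomes an early rejection.

```python
# Example 7.3 paraphrased: the state is conceptually a pair
# (control, stored-symbol).  Accepts strings whose first symbol
# appears nowhere else in the string.

def accepts(w):
    if not w:
        return False
    control, stored = 'q1', w[0]   # rule 1: record the first symbol seen
    for c in w[1:]:
        if c == stored:            # delta undefined here: halt, reject
            return False
        # rule 2: otherwise keep moving right; the pair-state is unchanged
    return True                    # rule 3: blank reached, enter [q1, B]

print(accepts("01111"), accepts("0110"))   # True False
```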
Multiple tracks

We can imagine that the tape of the Turing machine is divided into k tracks, for any finite k. This arrangement is shown in Fig. 7.4, with k = 3. The symbols on the tape are considered k-tuples, one component for each track.

Fig. 7.4 A three-track Turing machine. (The first track holds ¢101111$, that is, 47 in binary between end markers; the second holds 101, that is, 5; the third holds 100101, that is, 37.)

Example 7.4 The tape in Fig. 7.4 belongs to a TM that takes a binary input greater than 2, written on the first track, and determines whether it is a prime. The input is surrounded by ¢ and $ on the first track. Thus, the allowable input symbols are [¢, B, B], [0, B, B], [1, B, B], and [$, B, B]. These symbols can be identified with ¢, 0, 1, and $, respectively, when viewed as input symbols. The blank symbol can be identified with [B, B, B].

To test if its input is a prime, the TM first writes the number two in binary on the second track and copies the first track onto the third. Then the second track is subtracted, as many times as possible, from the third track, effectively dividing the third track by the second and leaving the remainder. If the remainder is zero, the number on the first track is not a prime. If the remainder is nonzero, the number on the second track is increased by one. If the second track equals the first, the number on the first track is a prime, because it cannot be divided by any number lying properly between one and itself. If the second is less than the first, the whole operation is repeated for the new number on the second track.

In Fig. 7.4, the TM is testing to determine if 47 is a prime. The TM is dividing by 5; 5 has already been subtracted twice, so 37 appears on the third track.
Checking off symbols
how
Checking
off symbols is a useful trick for visualizing guages defined by repeated strings, such as
{ww w |
It is
{wcy
in £*},
also useful
when
w and
|
w
in £*,
y
TM
a
{wwR w
or
y}
recognizes lan-
in £*}.
\
lengths of substrings must be compared, such as in the
languages {flV
We
>
i
{oW \i+J or ; +
or
1}
k}.
introduce an extra track on the tape that holds a blank or yj. The yj in one of its it has been considered by the
TM
appears when the symbol below
comparisons.
Example
Q=
The second component of
The input symbol just
M = (Q, E, T,
Consider a Turing machine
7.5
+
nizes the language {vvcw|w in (a
[B, d]
is
conceptual tools; that
=
b)
+ }.
£=
d]\d
=
[B,
B]
define 1)
is (5
[B, d]
= ku B\
=
d
a,b, or B}.
is
or
c}.
Remember
just another
that the
"name"
two "tracks" are
for d:
=
a, b, c,
blank symbol. For d
=
a or 6 and e
and
F=
or B},
{[g 9 , B]};
=
a or 5
we
as follows.
Bl
[*,
4 [^,4 *)l
checks the symbol scanned on the tape, stores the symbol
control,
and moves
2) 5([q 2 ,d),
[B,e])=([q 2 ,d\,
3) S([q 2 ,d\,[B,c])
finding
c,
=
in the finite
right.
M continues to move On
,
d
and
identified with B, the
8{[q u
M
a, b,
identified with d. is,
q 0 B, F), which recog-
used to store an input symbol,
is
r={[X, d]\X = BoTy/
and
•••,^9
the state {[£,
S,
Let
[B, e], R).
right,
over unchecked symbols, looking for
([q 3 ,d],[B,clR).
M enters a state with
first
component q 3
.
c.
TURING MACHINES
156
4) δ([q3, d], [√, e]) = ([q3, d], [√, e], R), for d = a or b and e = a or b. M moves right over checked symbols.

5) δ([q3, d], [B, d]) = ([q4, B], [√, d], L). M encounters an unchecked symbol. If the unchecked symbol matches the symbol stored in the finite control, M checks it and begins moving left. If the symbols disagree, M has no next move and so halts without accepting. M also halts if in state q3 it reaches [B, B] before finding an unchecked symbol.

6) δ([q4, B], [√, d]) = ([q4, B], [√, d], L), for d = a or b. M moves left over checked symbols.

7) δ([q4, B], [B, c]) = ([q5, B], [B, c], L). M encounters the symbol c.

8) δ([q5, B], [B, d]) = ([q6, B], [B, d], L), for d = a or b. If the symbol immediately to the left of c is unchecked, M proceeds left to find the rightmost checked symbol.

9) δ([q6, B], [B, d]) = ([q6, B], [B, d], L), for d = a or b. M proceeds left.

10) δ([q6, B], [√, d]) = ([q1, B], [√, d], R), for d = a or b. M encounters a checked symbol and moves right to pick up another symbol for comparison. The first component of the state becomes q1 again.

11) δ([q5, B], [√, d]) = ([q7, B], [√, d], R), for d = a or b. If M is in state [q5, B] immediately after crossing c moving left, and a checked symbol appears immediately to the left of c, then all symbols to the left of c have been checked. M must then test whether all symbols to the right of c have been checked. If so, they must have compared properly with the symbols to the left of c, so M will accept. (See rule 7.)

12) δ([q7, B], [B, c]) = ([q8, B], [B, c], R). M moves right over c.

13) δ([q8, B], [√, d]) = ([q8, B], [√, d], R), for d = a or b. M moves to the right over checked symbols.

14) δ([q8, B], [B, B]) = ([q9, B], [√, B], L). If M finds [B, B], the blank, it halts and accepts. If M finds an unchecked symbol when the first component of its state is q8, it halts without accepting.
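As a quick check on the fourteen rules above, they can be exercised with a small simulator (a sketch, not from the text; states are encoded as pairs (q, d), tape symbols as pairs (check, d), and "V" stands for √):

```python
# Sketch simulator for the wcw-recognizer of Example 7.5 (not from the text).
def make_delta():
    d = {}
    for x in "ab":
        d[("q1","B"), ("B",x)] = (("q2",x), ("V",x), +1)   # rule 1
        for e in "ab":
            d[("q2",x), ("B",e)] = (("q2",x), ("B",e), +1) # rule 2
            d[("q3",x), ("V",e)] = (("q3",x), ("V",e), +1) # rule 4
        d[("q2",x), ("B","c")] = (("q3",x), ("B","c"), +1) # rule 3
        d[("q3",x), ("B",x)]   = (("q4","B"), ("V",x), -1) # rule 5
        d[("q4","B"), ("V",x)] = (("q4","B"), ("V",x), -1) # rule 6
        d[("q5","B"), ("B",x)] = (("q6","B"), ("B",x), -1) # rule 8
        d[("q6","B"), ("B",x)] = (("q6","B"), ("B",x), -1) # rule 9
        d[("q6","B"), ("V",x)] = (("q1","B"), ("V",x), +1) # rule 10
        d[("q5","B"), ("V",x)] = (("q7","B"), ("V",x), +1) # rule 11
        d[("q8","B"), ("V",x)] = (("q8","B"), ("V",x), +1) # rule 13
    d[("q4","B"), ("B","c")] = (("q5","B"), ("B","c"), -1) # rule 7
    d[("q7","B"), ("B","c")] = (("q8","B"), ("B","c"), +1) # rule 12
    d[("q8","B"), ("B","B")] = (("q9","B"), ("V","B"), -1) # rule 14
    return d

def accepts(w):
    delta = make_delta()
    tape = {i: ("B", ch) for i, ch in enumerate(w)}  # blank check-track
    state, pos = ("q1", "B"), 0
    while (state, tape.get(pos, ("B", "B"))) in delta:
        state, sym, move = delta[(state, tape.get(pos, ("B", "B")))]
        tape[pos] = sym
        pos += move
    return state[0] == "q9"   # halted: accept iff first component is q9
```

Running it, accepts("abcab") holds while accepts("abcba") fails, mirroring the comparison of the two copies of w.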
Shifting over

A Turing machine can make space on its tape by shifting all nonblank symbols a finite number of cells to the right. To do so, the tape head makes an excursion to the right, repeatedly storing the symbols read in its finite control and replacing them with symbols read from cells to the left. The TM can then return to the vacated cells and print symbols of its choosing. If space is available, it can push blocks of symbols left in a similar manner.

Example 7.6 We construct part of a Turing machine M = (Q, Σ, Γ, δ, q0, B, F), which may occasionally have a need to shift nonblank symbols two cells to the right. We suppose that M's tape does not contain blanks between nonblanks, so that when it reaches a blank it knows to stop the shifting process. Let Q contain states of the form [q, A1, A2] for q = q1 or q2 and A1 and A2 in Γ. Let X be a special symbol not used by M except in the shifting process. M starts the shifting process in state [q1, B, B]. The relevant portions of the function δ are as follows.

1) δ([q1, B, B], A1) = ([q1, B, A1], X, R) for A1 in Γ - {B, X}. M stores the first symbol read in the third component of its state. X is printed on the cell scanned, and M moves to the right.

2) δ([q1, B, A1], A2) = ([q1, A1, A2], X, R) for A1 and A2 in Γ - {B, X}. M shifts the symbol in the third component to the second component, stores the symbol being read in the third component, prints an X, and moves right.

3) δ([q1, A1, A2], A3) = ([q1, A2, A3], A1, R) for A1, A2, and A3 in Γ - {B, X}. M now repeatedly reads a symbol A3, stores it in the third component of its state, shifts the symbol previously in the third component, A2, to the second component, deposits the previous second component, A1, on the cell scanned, and moves right. Thus a symbol will be deposited two cells to the right of its original position.

4) δ([q1, A1, A2], B) = ([q1, A2, B], A1, R) for A1 and A2 in Γ - {B, X}. When a blank is seen on the tape, the stored symbols are deposited on the tape.

5) δ([q1, A1, B], B) = ([q2, B, B], A1, L). After all symbols have been deposited, M sets the first component of its state to q2 and moves left to find an X, which marks the rightmost vacated cell.

6) δ([q2, B, B], A) = ([q2, B, B], A, L) for A in Γ - {B, X}. M moves left until an X is found. When X is found, M transfers to a state that we have assumed exists in Q and resumes its other functions.
Subroutines

As with programs, a "modular" or "top-down" design is facilitated if we use subroutines to define elementary processes. A Turing machine can simulate any type of subroutine found in programming languages, including recursive procedures and any of the known parameter-passing mechanisms. We shall here describe only the use of parameterless, nonrecursive subroutines, but even these are quite powerful tools.

The general idea is to write part of a TM program to serve as a subroutine; it will have a designated initial state and a designated return state which temporarily has no move and which will be used to effect a return to the calling routine. To design a TM that "calls" the subroutine, a new set of states for the subroutine is made, and a move from the return state is specified. The call is effected by entering the initial state for the subroutine, and the return is effected by the move from the return state.
Example 7.7 The design of a TM M to implement the total recursive function "multiplication" is given below. M starts with 0^m 1 0^n on its tape and ends with 0^mn surrounded by blanks. The general idea is to place a 1 after 0^m 1 0^n and then copy the block of n 0's onto the right end m times, each time erasing one of the m 0's. The result is 1 0^n 1 0^mn. Finally the prefix 1 0^n 1 is erased, leaving 0^mn. The heart of the algorithm is a subroutine COPY, which begins in an ID 0^m 1 q1 0^n 1 0^i and eventually enters an ID 0^m 1 q5 0^n 1 0^(i+n). COPY is defined in Fig. 7.5. In state q1, on seeing a 0, M changes it to a 2 and enters state q2. In state q2, M moves right, to the next blank, deposits the 0, and starts left in state q3. In state q3, M moves left to a 2. On reaching a 2, state q1 is entered and the process repeats until the 1 is encountered, signaling that the copying process is complete. State q4 is used to convert the 2's back to 0's, and the subroutine halts in q5.

            0          1          2          B
    q1   (q2,2,R)   (q4,1,L)
    q2   (q2,0,R)   (q2,1,R)              (q3,0,L)
    q3   (q3,0,L)   (q3,1,L)   (q1,2,R)
    q4   (q4,0,L)   (q5,1,R)   (q4,0,L)

Fig. 7.5 δ for subroutine COPY.

To complete the program for multiplication, we add states to convert the initial ID q0 0^m 1 0^n to B 0^(m-1) 1 q1 0^n 1. That is, we need the rules

δ(q0, 0) = (q6, B, R),
δ(q6, 0) = (q6, 0, R),
δ(q6, 1) = (q1, 1, R).

Additional states are needed to convert an ID B^i 0^(m-i) 1 q5 0^n 1 0^ni to B^(i+1) 0^(m-i-1) 1 q1 0^n 1 0^ni, which restarts COPY, and to check whether i = m, that is, whether all m 0's have been erased. In the case that i = m, the leading 1 0^n 1 is erased and the computation halts in state q12. These moves are shown in Fig. 7.6.

            0           1           B
    q5   (q7,0,L)
    q7               (q8,1,L)
    q8   (q9,0,L)               (q10,B,R)
    q9   (q9,0,L)               (q10,B,R)
    q10  (q6,B,R)    (q11,B,R)
    q11  (q11,B,R)   (q12,B,R)

Fig. 7.6 Additional moves for TM performing multiplication.

Note that we could make more than one call to a subroutine if we rewrote the subroutine using a new set of states for each call.
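The complete machine can be checked by direct simulation (a sketch, not the book's construction verbatim). One wrinkle: the three initialization rules do not themselves write the separator 1 following 0^n, so this sketch assumes the tape already carries it and starts from 0^m 1 0^n 1; the q10 entry (q6, B, R) reuses q6 to restart COPY:

```python
# Sketch: simulate the multiplication TM of Example 7.7 end to end.
COPY = {  # Fig. 7.5
    ("q1","0"): ("q2","2","R"), ("q1","1"): ("q4","1","L"),
    ("q2","0"): ("q2","0","R"), ("q2","1"): ("q2","1","R"),
    ("q2","B"): ("q3","0","L"),
    ("q3","0"): ("q3","0","L"), ("q3","1"): ("q3","1","L"),
    ("q3","2"): ("q1","2","R"),
    ("q4","0"): ("q4","0","L"), ("q4","2"): ("q4","0","L"),
    ("q4","1"): ("q5","1","R"),
}
EXTRA = {  # initialization plus Fig. 7.6
    ("q0","0"): ("q6","B","R"), ("q6","0"): ("q6","0","R"),
    ("q6","1"): ("q1","1","R"),
    ("q5","0"): ("q7","0","L"), ("q7","1"): ("q8","1","L"),
    ("q8","0"): ("q9","0","L"), ("q8","B"): ("q10","B","R"),
    ("q9","0"): ("q9","0","L"), ("q9","B"): ("q10","B","R"),
    ("q10","0"): ("q6","B","R"), ("q10","1"): ("q11","B","R"),
    ("q11","0"): ("q11","B","R"), ("q11","1"): ("q12","B","R"),
}
DELTA = {**COPY, **EXTRA}

def multiply(m, n):
    # Assumption: the separator 1 after 0^n is written during setup.
    tape = dict(enumerate("0" * m + "1" + "0" * n + "1"))
    state, pos = "q0", 0
    while state != "q12":
        state, sym, move = DELTA[(state, tape.get(pos, "B"))]
        tape[pos] = sym
        pos += 1 if move == "R" else -1
    return sum(1 for v in tape.values() if v == "0")  # length of 0^mn
```

Tracing m = 2, n = 2 by hand reproduces the IDs quoted in the text; the function returns mn in every case tried.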
7.5 MODIFICATIONS OF TURING MACHINES

One reason for the acceptance of the Turing machine as a general model of computation is that the model with which we have been dealing is equivalent to many modified versions that would seem off-hand to have increased computing power. In this section we give informal proofs of some of these equivalence theorems.
Two-way infinite tape

A Turing machine with a two-way infinite tape is denoted by M = (Q, Σ, Γ, δ, q0, B, F), as in the original model. As its name implies, the tape is infinite to the left as well as to the right. We denote an ID of such a device as for the one-way infinite TM. We imagine, however, that there is an infinity of blank cells both to the left and right of the current nonblank portion of the tape. The relation ⊢, which relates two ID's if the ID on the right is obtained from the one on the left by a single move, is defined as for the original model, with the exception that if δ(q, X) = (p, Y, L), then qXα ⊢ pBYα (in the original model, no move could be made), and if δ(q, X) = (p, B, R), then qXα ⊢ pα (in the original, the B would appear to the left of p).

The initial ID is q0 w. While there was a left end to the tape in the original model, there is no left end of the tape for the Turing machine to "fall off," so it can proceed left as far as it wishes. The relation ⊢*, as usual, relates two ID's if the one on the right can be obtained from the one on the left by some number of moves.
Theorem 7.1 L is recognized by a Turing machine with a two-way infinite tape if and only if it is recognized by a TM with a one-way infinite tape.

Proof The proof that a TM with a two-way infinite tape can simulate a TM with a one-way infinite tape is easy. The former marks the cell to the left of its initial head position and then simulates the latter. If during the simulation the marked cell is reached, the simulation terminates without acceptance.
Conversely, let M2 = (Q2, Σ2, Γ2, δ2, q2, B, F2) be a TM with a two-way infinite tape. We construct M1, a Turing machine simulating M2 and having a tape that is infinite to the right only. M1 will have two tracks, one to represent the cells of M2's tape to the right of, and including, the tape cell initially scanned, the other to represent, in reverse order, the cells to the left of the initial cell. The relationship between the tapes of M2 and M1 is shown in Fig. 7.7, with the initial cell of M2 numbered 0, the cells to the right numbered 1, 2, ..., and the cells to the left numbered -1, -2, ....

[Fig. 7.7 (a) Tape of M2: ... A-2 A-1 A0 A1 A2 .... (b) Tape of M1: upper track A0 A1 A2 ...; lower track $ A-1 A-2 ....]

The first cell of M1's tape holds the symbol $ in the lower track, indicating that it is the leftmost cell. The finite control of M1 tells whether M1 is scanning a symbol appearing on the upper or on the lower track. It should be fairly evident that M1 can be constructed to simulate M2, in the sense that while M2 is to the right of the initial position of its input head, M1 works on the upper track. While M2 is to the left of its initial tape head position, M1 works on its lower track, moving in the direction opposite to the direction in which M2 moves. The input symbols of M1 are symbols with a blank on the lower track and an input symbol of M2 on the upper track. Such a symbol can be identified with the corresponding input symbol of M2. B is identified with [B, B].

We now give a formal construction of M1 = (Q1, Σ1, Γ1, δ1, q1, B, F1). The states, Q1, of M1 are all objects of the form [q, U] or [q, D], where q is in Q2, plus the symbol q1. Note that the second component will indicate whether M1 is working on the upper (U for up) or lower (D for down) track. The tape symbols in Γ1 are all objects of the form [X, Y], where X and Y are in Γ2. In addition, Y may be $, a symbol not in Γ2. Σ1 consists of all symbols [a, B], where a is in Σ2. F1 is {[q, U], [q, D] | q is in F2}. We define δ1 as follows.
1) For each a in Σ2 ∪ {B},

δ1(q1, [a, B]) = ([q, U], [X, $], R)  if δ2(q2, a) = (q, X, R).

If M2 moves right on its first move, M1 prints $ in the lower track to mark the end of the tape, sets the second component of its state to U, and moves right. The first component of M1's state holds the state of M2. On the upper track, M1 prints the symbol X that is printed by M2.

2) For each a in Σ2 ∪ {B},

δ1(q1, [a, B]) = ([q, D], [X, $], R)  if δ2(q2, a) = (q, X, L).

If M2 moves left on its first move, M1 records the next state of M2 and the symbol printed by M2 as in (1), but sets the second component of its state to D and moves right. Again, $ is printed in the lower track to mark the left end of the tape.

3) For each [X, Y] in Γ1, with Y ≠ $, and A = L or R,

δ1([q, U], [X, Y]) = ([p, U], [Z, Y], A)  if δ2(q, X) = (p, Z, A).

M1 simulates M2 on the upper track.

4) For each [X, Y] in Γ1, with Y ≠ $,

δ1([q, D], [X, Y]) = ([p, D], [X, Z], A')  if δ2(q, Y) = (p, Z, A).

Here A' is L if A is R, and A' is R if A is L. M1 simulates M2 on the lower track of M1. The direction of head motion of M1 is opposite to that of M2.

5) δ1([q, U], [X, $]) = δ1([q, D], [X, $]) = ([p, C], [Y, $], R)  if δ2(q, X) = (p, Y, A).

Here C = U if A = R, and C = D if A = L. M1 simulates a move of M2 on the cell initially scanned by M2. M2 next works on the upper or lower track, depending on the direction in which M2 moves. M1 will always move right in this situation.
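The bookkeeping in this construction is essentially an address translation: cell i of the two-way tape lives at upper-track position i when i >= 0 and at lower-track position -i when i < 0. A sketch (helper names invented, not from the text):

```python
# Sketch: fold a two-way infinite tape (dict {cell: symbol}) into the
# two one-way tracks of Theorem 7.1; position 0 of the lower track is $.
def fold(two_way, blank="B"):
    hi = max(two_way, default=0)
    lo = min(two_way, default=0)
    upper = [two_way.get(i, blank) for i in range(0, hi + 1)]
    lower = ["$"] + [two_way.get(-i, blank) for i in range(1, -lo + 1)]
    return upper, lower
```

For the tape {-2: "a", 0: "b", 1: "c"}, the upper track reads b c and the lower track reads $ B a, matching Fig. 7.7(b).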
Multitape Turing machines

A multitape Turing machine is shown in Fig. 7.8. It consists of a finite control with k tape heads and k tapes; each tape is infinite in both directions. On a single move, depending on the state of the finite control and the symbol scanned by each of the tape heads, the machine can:

1) change state;

2) print a new symbol on each of the cells scanned by its tape heads;

3) move each of its tape heads, independently, one cell to the left or right, or keep it stationary.

Initially, the input appears on the first tape, and the other tapes are blank. We shall not define the device more formally, as the formalism is cumbersome and a straightforward generalization of the notation for single-tape TM's.

Theorem 7.2 If a language L is accepted by a multitape Turing machine, it is accepted by a single-tape Turing machine.
[Fig. 7.8 Multitape Turing machine: a finite control attached to k heads, one per two-way infinite tape.]

Proof Let L be accepted by M1, a TM with k tapes. We can construct M2, a one-tape TM with 2k tracks, two tracks for each of M1's tapes. One track records the contents of the corresponding tape of M1, and the other is blank, except for a marker in the cell that holds the symbol scanned by the corresponding head of M1. The arrangement is illustrated in Fig. 7.9. The finite control of M2 stores the state of M1, along with a count of the number of head markers to the right of M2's tape head.

[Fig. 7.9 Simulation of three tapes by one: for each tape of M1 there is a track holding its contents (A1 ... Am, B1 ... Bm, C1 ... Cm) and, above it, a track holding an X in the cell scanned by that tape's head.]

Each move of M1 is simulated by a sweep from left to right and then from right to left by the tape head of M2. Initially, M2's head is at the leftmost cell containing a head marker. To simulate a move of M1, M2 sweeps right, visiting each of the cells with head markers and recording the symbol scanned by each head of M1. When M2 crosses a head marker, it must update the count of head markers to its right. When no more head markers are to the right, M2 has seen the symbols scanned by each of M1's heads, so M2 has enough information to determine the move of M1. Now M2 makes a pass left, until it reaches the leftmost head marker. The count of markers to the right enables M2 to tell when it has gone far enough. As M2 passes each head marker on the leftward pass, it updates the tape symbol of M1 "scanned" by that head marker, and moves the head marker one symbol left or right to simulate the move of M1. Finally, M2 changes the state of M1 recorded in M2's control to complete the simulation of one move of M1. If the new state of M1 is accepting, then M2 accepts.

Note that the first simulation of this section, that of a two-way infinite tape TM by a one-way infinite tape TM, was move for move. In the present simulation, however, many moves of M2 are needed to simulate one move of M1. In fact, since after k moves the heads of M1 can be 2k cells apart, it takes about the sum of 2j for j = 1 to k, which is approximately k^2, moves of M2 to simulate k moves of M1. (Actually, 2k more moves may be needed to simulate heads moving to the right.) This quadratic slowdown that occurs when we go from a multitape TM to a single-tape TM is inherently necessary for certain languages. While we defer a proof to Chapter 12, we shall here give an example of the efficiency of multitape TM's.
Example 7.8 The language L = {w w^R | w in (0+1)*} can be recognized on a single-tape TM by moving the tape head back and forth on the input, checking symbols from both ends and comparing them. The process is similar to that of Example 7.5.

To recognize L with a two-tape TM, the input is copied onto the second tape. The input is then compared with the reversal on the other tape by moving the heads in opposite directions, and the length of the input is checked to make sure it is even.

Note that the number of moves used to recognize L by the one-tape machine is approximately the square of the input length, while with a two-tape machine, time proportional to the input length is sufficient.
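The two-tape strategy amounts to one copying pass and one comparing pass, each linear in the input length (a sketch, with the second tape modeled as a list):

```python
# Sketch of Example 7.8's two-tape method: copy the input to tape 2,
# then compare the tapes with heads moving in opposite directions.
def is_even_palindrome(w):
    tape2 = list(w)                  # pass 1: copy input onto tape 2
    if len(w) % 2:                   # length check: w w^R is always even
        return False
    h1, h2 = 0, len(tape2) - 1       # heads start at opposite ends
    while h1 < len(w):
        if w[h1] != tape2[h2]:
            return False
        h1, h2 = h1 + 1, h2 - 1      # opposite directions
    return True
```

Each symbol is visited a constant number of times, versus the quadratic back-and-forth of the one-tape method.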
Nondeterministic Turing machines

A nondeterministic Turing machine is a device with a finite control and a single, one-way infinite tape. For a given state and tape symbol scanned by the tape head, the machine has a finite number of choices for the next move. Each choice consists of a new state, a tape symbol to print, and a direction of head motion. Note that the nondeterministic TM is not permitted to make a move in which the next state is selected from one choice, and the symbol printed and/or direction of head motion are selected from another choice. The nondeterministic TM accepts its input if any sequence of choices of moves leads to an accepting state.

As with the finite automaton, the addition of nondeterminism to the Turing machine does not allow the device to accept new languages. In fact, the combination of nondeterminism with any of the extensions presented or to be presented, such as two-way infinite or multitape TM's, does not add additional power. We leave these results as exercises, and prove only the basic result regarding the simulation of a nondeterministic TM by a deterministic one.
Theorem 7.3 If L is accepted by a nondeterministic Turing machine M1, then L is accepted by some deterministic Turing machine M2.

Proof For any state and tape symbol of M1, there is a finite number of choices for the next move. These can be numbered 1, 2, .... Let r be the maximum number of choices for any state-tape symbol pair. Then any finite sequence of choices can be represented by a sequence of the digits 1 through r. Not all such sequences may represent choices of moves, since there may be fewer than r choices in some situations.

M2 will have three tapes. The first will hold the input. On the second, M2 will generate sequences of the digits 1 through r in a systematic manner. Specifically, the sequences will be generated with the shortest appearing first. Sequences of equal length are generated in numerical order.

For each sequence generated on tape 2, M2 copies the input onto tape 3 and then simulates M1 on tape 3, using the sequence on tape 2 to dictate the moves of M1. If M1 enters an accepting state, M2 also accepts. If there is a sequence of choices leading to acceptance, it will eventually be generated on tape 2, and when it is simulated, M2 will accept. But if no sequence of choices of moves of M1 leads to acceptance, M2 will not accept.
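The three-tape search can be imitated with ordinary loops (a toy sketch, not the book's construction: the choice-sequence length is capped here, since a true simulation keeps generating longer sequences forever on a rejected input):

```python
from itertools import product

# Sketch of Theorem 7.3: deterministic search over choice sequences.
# delta maps (state, symbol) to a LIST of (state, symbol, "L"/"R") choices.
def ntm_accepts(delta, accept, w, max_len=12):
    r = max((len(v) for v in delta.values()), default=1)
    for n in range(max_len + 1):                 # shortest sequences first
        for seq in product(range(r), repeat=n):  # equal length: numerical order
            state, tape, pos = "q0", dict(enumerate(w)), 0
            for c in seq:
                choices = delta.get((state, tape.get(pos, "B")), [])
                if c >= len(choices):
                    break                        # not a valid choice string
                state, sym, mv = choices[c]
                tape[pos] = sym
                pos += 1 if mv == "R" else -1
                if state in accept:
                    return True
    return False                                 # nothing accepted up to max_len
```

For instance, a machine that nondeterministically guesses the start of a "00" substring accepts "100" but not "1010" under this search.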
Multidimensional Turing machines

Let us consider another modification of the Turing machine that adds no additional power: the multidimensional Turing machine. The device has the usual finite control, but the tape consists of a k-dimensional array of cells infinite in all 2k directions, for some fixed k. Depending on the state and symbol scanned, the device changes state, prints a new symbol, and moves its tape head in one of 2k directions, either positively or negatively, along one of the k axes. Initially, the input is along one axis, and the head is at the left end of the input.

At any time, only a finite number of rows in any dimension contain nonblank symbols, and these rows each have only a finite number of nonblank symbols. For example, consider the tape configuration of the two-dimensional TM shown in Fig. 7.10(a). Draw a rectangle about the nonblank symbols, as also shown in Fig. 7.10(a). The rectangle can be represented row by row on a single tape, as shown in Fig. 7.10(b). The *'s separate the rows. A second track may be used to indicate the position of the two-dimensional TM's tape head.

We shall prove that a one-dimensional TM can simulate a two-dimensional TM, leaving the generalization to more than two dimensions as an exercise.

Theorem 7.4 If L is accepted by a two-dimensional TM M2, then L is accepted by a one-dimensional TM M1.
     B   B   B   a1  B   B   B
     B   B   a2  a3  a4  a5  B
     a6  a7  a8  a9  B   a10 B
     B   a11 a12 a13 B   a14 a15
     B   B   a16 a17 B   B   B

(a) Two-dimensional tape.

** BBBa1BBB * BBa2a3a4a5B * a6a7a8a9Ba10B * Ba11a12a13Ba14a15 * BBa16a17BBB **

(b) One-dimensional simulation.

Fig. 7.10 Simulation of two dimensions by one.
Proof M1 represents the tape of M2 as in Fig. 7.10(b). M1 will also have a second tape used for purposes we shall describe, and the tapes are two-way infinite. Suppose that M2 makes a move in which the head does not leave the rectangle already represented by M1's tape. If the move is horizontal, M1 simply moves its head marker one cell left or right after printing a new symbol and changing the state of M2 recorded in M1's control. If the move is vertical, M1 uses its second tape to count the number of cells between the tape head position and the * to its left. Then M1 moves to the * to the right, if the move is down, or the * to the left if the move is up, and puts the tape head marker at the corresponding position in the new block (region between *'s) by using the count on the second tape.

Now consider the situation when M2's head moves off the rectangle represented by M1. If the move is vertical, M1 adds a new block of blanks to the left or right, using the second tape to count the current length of blocks. If the move is horizontal, M1 uses the "shifting over" technique to add a blank at the left or right end of each block, as appropriate. Note that double *'s mark the ends of the region used to hold blocks, so M1 can tell when it has augmented all blocks. After creating room to make the move, M1 simulates the move of M2 as described above.
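The row-by-row layout makes a vertical move a fixed-distance jump on the flat tape (a sketch with invented helper names):

```python
# Sketch: the flattening of Fig. 7.10(b). Rows of a width-w rectangle are
# separated by '*', with '**' at both ends; cell (i, j) sits at flat
# index 2 + i*(w+1) + j, so moving down is a jump of w + 1 positions.
def flatten(grid):
    flat = ["*", "*"]
    for row in grid:
        flat += row + ["*"]
    flat.append("*")
    return flat

def move_down(flat_pos, width):
    return flat_pos + width + 1      # skip the rest of the row and its '*'
```

On the one-dimensional tape, M1 realizes this jump by counting cells to the previous * on its second tape, exactly as in the proof.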
Multihead Turing machines

A k-head Turing machine has some fixed number, k, of heads. The heads are numbered 1 through k, and a move of the TM depends on the state and on the symbol scanned by each head. In one move, the heads may each move independently left, right, or remain stationary.

Theorem 7.5 If L is accepted by some k-head TM M1, it is accepted by a one-head TM M2.

Proof The proof is similar to that of Theorem 7.2 for multitape TM's. M2 has k + 1 tracks on its tape; the last holds the tape of M1, and the ith holds a marker indicating the position of the ith tape head, for 1 <= i <= k. The details are left for an exercise.
Off-line Turing machines

An off-line Turing machine is a multitape TM whose input tape is read-only. Usually we surround the input by endmarkers, ¢ on the left and $ on the right. The Turing machine is not allowed to move the input tape head off the region between ¢ and $. It should be obvious that the off-line TM is just a special case of the multitape TM, and therefore is no more powerful than any of the models we have considered. Conversely, an off-line TM can simulate any TM M by using one more tape than M. The first thing the off-line TM does is copy its own input onto the extra tape, and it then simulates M as if the extra tape were M's input. The need for off-line TM's will become apparent in Chapter 12, when we consider limiting the amount of storage space to less than the input length.
7.6 CHURCH'S HYPOTHESIS

The assumption that the intuitive notion of "computable function" can be identified with the class of partial recursive functions is known as Church's hypothesis or the Church-Turing thesis. While we cannot hope to "prove" Church's hypothesis as long as the informal notion of "computable" remains informal, we can give evidence for its reasonableness. As long as our intuitive notion of "computable" places no bound on the number of steps or the amount of storage, it would seem that the partial recursive functions are intuitively computable, although some would argue that a function is not "computable" unless we can bound the computation in advance or at least establish whether or not the computation eventually terminates.

What is less clear is whether the class of partial recursive functions includes all "computable" functions. Logicians have presented many other formalisms, such as the λ-calculus, Post systems, and general recursive functions. All have been shown to define the same class of functions, i.e., the partial recursive functions. In addition, abstract computer models, such as the random access machine (RAM), also give rise to the partial recursive functions.
The RAM consists of an infinite number of memory words, numbered 0, 1, 2, ..., each of which can hold any integer, and a finite number of arithmetic registers, each capable of holding any integer. Integers may be decoded into the usual sorts of computer instructions. We shall not define the RAM model more formally, but it should be clear that if we choose a suitable set of instructions, the RAM may simulate any existing computer. The proof that the Turing machine formalism is as powerful as the RAM formalism is given below. Some other formalisms are discussed in the exercises.
Simulation of random access machines by Turing machines

Theorem 7.6 A Turing machine can simulate a RAM, provided that the elementary RAM instructions can themselves be simulated by a TM.

Proof We use a multitape TM M to perform the simulation. One tape of M holds the words of the RAM that have been given values. The tape looks like

# 0 * v0 # 1 * v1 # 10 * v2 # ... # i * vi # ...

where vi is the contents, in binary, of the ith word. At all times, there will be some finite number of words of the RAM that have been used, and M needs only to keep a record of values up to the largest numbered word that has been used so far.

The RAM has some finite number of arithmetic registers. M uses one tape to hold each register's contents, one tape to hold the location counter, which contains the number of the word from which the next instruction is to be taken, and one tape as a memory address register on which the number of a memory word may be placed.

Suppose that the first 10 bits of an instruction denote one of the standard computer operations, such as LOAD, STORE, ADD, and so on, and that the remaining bits denote the address of an operand. While we shall not discuss the details of implementation for all standard computer instructions, an example should make the techniques clear. Suppose the location counter tape of M holds number i in binary. M searches its first tape from the left, looking for # i *. If a blank is encountered before finding # i *, there is no instruction in word i, so the RAM and M halt. If # i * is found, the bits following * up to the next # are examined. Suppose the first 10 bits are the code for "ADD to register 2," and the remaining bits are some number j in binary. M adds 1 to i on the location counter tape and copies j onto the memory address tape. Then M searches for # j * on the first tape, again starting from the left (note that # 0 * marks the left end). If # j * is not found, we assume word j holds 0 and go on to the next instruction of the RAM. If # j * vj # is found, vj is added to the contents of register 2, which is stored on its own tape. We then repeat the cycle with the next instruction.

Observe that although the RAM simulation used a multitape Turing machine, by Theorem 7.2 a single-tape TM would suffice, although the simulation would be more complicated.
7.7 TURING MACHINES AS ENUMERATORS

We have viewed Turing machines as recognizers of languages and as computers of functions on the nonnegative integers. There is a third useful view of Turing machines, as generating devices. Consider a multitape TM M that uses one tape as an output tape, on which a symbol, once written, can never be changed, and whose tape head never moves left. Suppose also that on the output tape, M writes strings over some alphabet Σ, separated by a marker symbol #. We can define G(M), the language generated by M, to be the set of w in Σ* such that w is eventually printed between a pair of #'s on the output tape.

Note that unless M runs forever, G(M) is finite. Also, we do not require that words be generated in any particular order, or that any particular word be generated only once. If L is G(M) for some TM M, then L is an r.e. set, and conversely. The recursive sets also have a characterization in terms of generators; they are exactly the languages whose words can be generated in order of increasing size. These equivalences will be proved in turn.
Characterization of r.e. sets by generators

Lemma 7.1 If L is G(M1) for some TM M1, then L is an r.e. set.

Proof Construct TM M2 with one more tape than M1. M2 simulates M1 using all but M2's input tape. Whenever M1 prints # on its output tape, M2 compares its input with the word just generated. If they are the same, M2 accepts; otherwise M2 continues to simulate M1. Clearly M2 accepts an input x if and only if x is in G(M1). Thus L(M2) = G(M1) = L.
The converse of Lemma 7.1 is somewhat more difficult. Suppose M1 is a recognizer for some r.e. set L ⊆ Σ*. Our first (and unsuccessful) attempt at designing a generator for L might be to generate the words in Σ* in some order w1, w2, .... Then run M1 on w1, and if M1 accepts, generate w1. Then run M1 on w2, generating w2 if M1 accepts, and so on. This method works if M1 is guaranteed to halt on all inputs. However, as we shall see in Chapter 8, there are languages L that are r.e. but not recursive. If such is the case, we must contend with the possibility that M1 never halts on some wi. Then M2 never considers wi+1, wi+2, ..., and so cannot generate any of these words, even if M1 accepts them.

We must therefore avoid simulating M1 indefinitely on any one word. To do this we fix an order for enumerating words in Σ*. Next we develop a method of generating all pairs (i, j) of positive integers. The simulation proceeds by generating a pair (i, j) and then simulating M1 on the ith word, for j steps.

We fix a canonical order for Σ* as follows. List words in order of size, with words of the same size in "numerical order." That is, let Σ = {a0, a1, ..., a(k-1)}, and imagine that ai is the "digit" i in base k. Then the words of length n are the numbers 0 through k^n - 1 written in base k. The design of a TM to generate words in canonical order is not hard, and we leave it as an exercise.

Example 7.9 If Σ = {0, 1}, the canonical order is ε, 0, 1, 00, 01, 10, 11, 000, 001, ...
Note that the seemingly simpler order in which we generate the shortest representation of 0, 1, 2, ... in base k will not work, as we never generate words like a0 a0 a1, which have "leading 0's."

Next consider generating pairs (i, j) such that each pair is generated after some finite amount of time. This task is not so easy as it seems. The naive approach, (1, 1), (1, 2), (1, 3), ..., never generates any pairs with i > 1. Instead, we shall generate pairs in order of the sum i + j, and among pairs of equal sum, in order of increasing i. That is, we generate (1, 1), (1, 2), (2, 1), (1, 3), (2, 2), (3, 1), (1, 4), .... The pair (i, j) is the [(i + j - 1)(i + j - 2)/2 + i]th pair generated. Thus this ordering has the desired property that there is a finite time at which any particular pair (i, j) is generated.

A TM generating pairs (i, j) in this order in binary is easy to design, and we leave its construction to the reader. We shall refer to such a TM as the pair generator in the future. Incidentally, the ordering used by the pair generator demonstrates that pairs of integers can be put into one-to-one correspondence with the integers themselves, a seemingly paradoxical result that was discovered by Georg Cantor when he showed that the rationals (which are really ratios of two integers) are equinumerous with the integers.
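Both the ordering and the position formula can be checked directly (a sketch):

```python
from itertools import count

# Sketch: the pair generator's order -- by increasing sum i + j, then by
# increasing i -- together with the position formula from the text.
def pairs():
    for s in count(2):              # s = i + j
        for i in range(1, s):
            yield (i, s - i)

def position(i, j):                 # (i, j) is the position(i, j)-th pair
    return (i + j - 1) * (i + j - 2) // 2 + i
```

For example, (2, 2) is the fifth pair generated, in agreement with the formula.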
Theorem 7.7 A language is r.e. if and only if it is G(M₂) for some TM M₂.

Proof With Lemma 7.1 we have only to show how an r.e. set L = L(M₁) can be generated by a TM M₂. M₂ simulates the pair generator. When (i, j) is generated, M₂ produces the ith word wᵢ in canonical order and simulates M₁ on wᵢ for j steps. If M₁ accepts on the jth step (counting the initial ID as step 1), then M₂ generates wᵢ. Surely M₂ generates no word not in L. If w is in L, let w be the ith word in canonical order for the alphabet of L, and let M₁ accept w after exactly j moves. As it takes only a finite amount of time for M₂ to generate any particular word in canonical order or to simulate M₁ on any particular word for any particular number of steps, we know M₂ will eventually produce the pair (i, j). At that stage, w will be generated by M₂. Thus G(M₂) = L.

Corollary If L is an r.e. set, then there is a generator for L that enumerates each word in L exactly once.

Proof M₂ described above has that property, since M₂ generates wᵢ only when considering the pair (i, j), where j is exactly the number of steps taken by M₁ to accept wᵢ.
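The proof's dovetailing simulation can be mirrored in code. Here a toy predicate `accepts_within(w, j)`, standing in for "M₁ accepts w on exactly the jth step," is a hypothetical placeholder rather than a real TM simulator.

```python
from itertools import count, islice

def canonical_words():
    # canonical order over {0, 1}: e, 0, 1, 00, 01, ...
    yield ""
    for n in count(1):
        for m in range(2 ** n):
            yield format(m, "b").zfill(n)

def accepts_within(w, j):
    # Toy stand-in for "M1 accepts w on the j-th step": pretend M1
    # accepts exactly the even-length words, taking len(w) + 1 steps.
    return len(w) % 2 == 0 and j == len(w) + 1

def generate_language():
    """Enumerate L(M1) by dovetailing: for each pair (i, j), emit the
    i-th word iff M1 accepts it on exactly the j-th step.  Each word
    of L appears exactly once, as in the corollary."""
    for s in count(2):
        for i in range(1, s):
            j = s - i
            w = next(islice(canonical_words(), i - 1, i))
            if accepts_within(w, j):
                yield w

print(list(islice(generate_language(), 5)))
# ['', '00', '01', '10', '11']
```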
Characterization of recursive sets by generators
We shall now show that the recursive sets are precisely those sets whose words can be generated in canonical order.
Lemma 7.2 If L is recursive, then there is a generator for L that prints the words of L in canonical order and prints no other words.

Proof Let L = L(M₁) ⊆ Σ*, where M₁ halts on every input. Construct M₂ to generate L as follows. M₂ generates (on a scratch tape) the words in Σ*, one at a time, in canonical order. After generating some word w, M₂ simulates M₁ on w. If M₁ accepts w, M₂ generates w. Since M₁ is guaranteed to halt, we know that M₂ will finish processing each word after a finite time and will therefore eventually consider each particular word in Σ*. Clearly M₂ generates L in canonical order.
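Lemma 7.2's construction is just a filter over the canonical order; below, a hypothetical total predicate `decides(w)` plays the role of the halting machine M₁.

```python
from itertools import count, islice

def canonical_words():
    yield ""
    for n in count(1):
        for m in range(2 ** n):
            yield format(m, "b").zfill(n)

def decides(w):
    # Stand-in for the algorithm M1: total, always halts.
    # Toy example: accept words with no two consecutive 1's.
    return "11" not in w

def generate_in_canonical_order():
    """M2 of Lemma 7.2: walk Sigma* in canonical order and emit
    exactly the accepted words, hence in canonical order."""
    for w in canonical_words():
        if decides(w):
            yield w

print(list(islice(generate_in_canonical_order(), 7)))
# ['', '0', '1', '00', '01', '10', '000']
```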
The converse of Lemma 7.2, that if L can be generated in canonical order then L is recursive, is also true. However, there is a subtlety of which we should be aware. In Lemma 7.2, given M₁ we could actually construct M₂ from M₁. However, given a TM M₂ generating L in canonical order, we know a halting TM recognizing L exists, but there is no algorithm to exhibit that TM. The natural thing to do is to construct a TM M₁ that on input w simulates M₂ until M₂ either generates w or a word beyond w in the canonical order. In the former case, M₁ accepts w, and in the latter case, M₁ halts without accepting w. However, if L is finite, M₂ may never halt after generating the last word in L, so M₂ may generate neither w nor any word beyond. In this situation M₁ would not halt. This problem arises only when L is finite, and we know every finite set is accepted by a Turing machine that halts on all inputs. Unfortunately, we cannot determine whether a TM generates a finite set or, if finite, which finite set it is. Thus we know that a halting Turing machine accepting L, the language generated by M₂, always exists, but there is no algorithm to exhibit the Turing machine.

Theorem 7.8 L is recursive if and only if L is generated in canonical order.

Proof The "only if" part was established by Lemma 7.2. For the "if" part, when L is infinite, M₁ described above is a halting Turing machine for L. Clearly, when L is finite, there is a finite automaton accepting L, and thus L can be accepted by a TM that halts on all inputs. Note that in general we cannot exhibit a particular halting TM that accepts L, but the theorem merely states that one such TM exists.
7.8 RESTRICTED TURING MACHINES EQUIVALENT TO THE BASIC MODEL

In Section 7.5 we considered generalizations of the basic TM model. As we have seen, these generalizations have no more computational power than the basic model. We conclude this chapter by considering some models that at first appear less powerful than the TM but indeed are just as powerful. For the most part, these models will be variations of the pushdown automaton defined in Chapter 5.
we note that a pushdown automaton is equivalent to a nondeterwith a read-only input on which the input head cannot move left,
In passing, ministic
TM
Whenever Thus the storage tape to
plus a storage tape with a rather peculiar restriction on the tape head.
moves
the storage tape head the right of the head
is
left, it
must
print a blank.
always completely blank, and the storage tape
a stack, with the top at the right, rather than the
left
as in Chapter
is
effectively
5.
Multistack machines

A deterministic two-stack machine is a deterministic Turing machine with a read-only input and two storage tapes. If a head moves left on either tape, a blank is printed on that tape.

Lemma 7.3 An arbitrary single-tape Turing machine can be simulated by a deterministic two-stack machine.

Proof The symbols to the left of the head of the TM being simulated can be stored on one stack, while the symbols on the right of the head can be placed on the other stack. On each stack, symbols closer to the TM's head are placed closer to the top of the stack than symbols farther from the TM's head.
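The two-stack representation of a tape in Lemma 7.3 can be sketched in a few lines; this is an illustrative fragment, not the book's construction verbatim.

```python
class TwoStackTape:
    """A TM tape held as two stacks: `left` holds the symbols to the
    left of the head (top = nearest the head); `right` holds the
    scanned symbol and everything to its right (top = scanned cell)."""

    def __init__(self, contents, blank="B"):
        self.blank = blank
        self.left = []                        # symbols left of the head
        self.right = list(reversed(contents)) or [blank]

    def read(self):
        return self.right[-1]

    def write(self, symbol):
        self.right[-1] = symbol

    def move_right(self):
        self.left.append(self.right.pop())    # scanned cell moves to the left stack
        if not self.right:
            self.right.append(self.blank)     # extend the tape with a blank

    def move_left(self):
        self.right.append(self.left.pop() if self.left else self.blank)

tape = TwoStackTape("101")
tape.write("0")
tape.move_right()
print(tape.read())   # '0': the cell to the right of the rewritten one
```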
Counter machines
We
can prove a result stronger than Lemma 7.3. It concerns counter machines, which are off-line Turing machines whose storage tapes are semi-infinite, and whose tape alphabets contain only two symbols, Z and B (blank). Furthermore, the symbol Z, which serves as a bottom of stack marker, appears initially on the cell
scanned by the tape head and
may never appear on any
can be stored by moving the tape head
i
cells to the right
other of Z.
cell.
A
An
stored
integer
i
number
left. We can whether a number is zero by checking whether Z is scanned by the head, but we cannot directly test whether two numbers are equal. An example of a counter machine is shown in Fig. 7.11; § and $ are customarily used for end markers on the input. Here Z is the nonblank symbol on each tape. An instantaneous description of a counter machine can be described by the state, the input tape contents, the position of the input head, and the distance of the storage heads from the symbol Z (shown here as d and d 2 ). We call these
can be incremented or decremented by moving the tape head right or test
x
on the tapes. The counter machine, a count on each tape and tell if that count is zero.
distances the counts
Fig. 7.11 Counter machine. (The figure shows a finite control with a read-only input delimited by ¢ and $, and two semi-infinite storage tapes, each holding Z followed by blanks.)

Lemma 7.4 A four-counter machine can simulate an arbitrary Turing machine.

Proof From Lemma 7.3, it suffices to show that two counter tapes can simulate one stack. Let a stack have k − 1 tape symbols, Z₁, Z₂, ..., Z_{k-1}. Then we can represent the stack Z_{i1} Z_{i2} ⋯ Z_{im} uniquely by the count in base k

    j = i_m + k i_{m-1} + k^2 i_{m-2} + ⋯ + k^{m-1} i_1.    (7.3)

Note that not every integer represents a stack; in particular, those whose base-k representation contains the digit 0 do not.

Suppose that the symbol Z_r is pushed onto the top (right end) of the stack Z_{i1} Z_{i2} ⋯ Z_{im}. The count associated with Z_{i1} Z_{i2} ⋯ Z_{im} Z_r is jk + r. To get this new count, the counter machine repeatedly moves the head of the first counter one cell to the left and the head of the second, k cells to the right. When the head of the first counter reaches the nonblank symbol, the second counter will hold the count jk. It is a simple matter to add r to the count.

If, instead, the top symbol Z_{im} of the stack were popped, j should be replaced by ⌊j/k⌋, the integer part of j/k. We repeatedly decrement the count on the first counter by k and then add one to the second count. When the first count is zero, the second count will be ⌊j/k⌋.

To complete the description of the simulation, we must show how the four-counter machine can tell what symbol is at the top of each stack. If the count j is stored on one counter, the four-counter machine can copy j to another counter, computing j mod k in its finite control. Note that j mod k is i_m if j is given by (7.3).
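The count manipulations in the proof (push = jk + r, pop = ⌊j/k⌋, top = j mod k) can be checked with a small sketch; the counter machine performs these with unary head moves, but the arithmetic is the same.

```python
def push(j, r, k):
    """Push symbol Z_r (1 <= r <= k-1) onto the stack with count j."""
    return j * k + r

def pop(j, k):
    """Pop the top symbol: replace j by the integer part of j / k."""
    return j // k

def top(j, k):
    """The index of the top symbol is j mod k, by equation (7.3)."""
    return j % k

k = 4                         # stack alphabet Z1, Z2, Z3 (k - 1 symbols)
j = 0                         # empty stack
for r in (2, 1, 3):           # push Z2, then Z1, then Z3
    j = push(j, r, k)
print(top(j, k))              # 3: Z3 is on top
j = pop(j, k)
print(top(j, k))              # 1: Z1 is now on top
```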
Theorem 7.9 A two-counter machine can simulate an arbitrary Turing machine.

Proof By Lemma 7.4, it is sufficient to show how to simulate four counters with two. Let the four counters have counts i, j, k, and l. One counter can represent these four by the number n = 2^i 3^j 5^k 7^l. Since 2, 3, 5, and 7 are primes, i, j, k, and l can be uniquely recovered from n.

To increment i, j, k, or l by 1, we multiply n by 2, 3, 5, or 7, respectively. To do so, we have another counter set to zero, and we can move the head of this counter 2, 3, 5, or 7 cells to the right each time we move the head of the first counter one cell to the left. When the first counter holds zero, the second will hold 2n, 3n, 5n, or 7n, respectively. To decrement i, j, k, or l by 1, n is, by a similar process, divided by 2, 3, 5, or 7, respectively.

We must also show how the two-counter machine can determine the next move of the four-counter machine. The two-counter machine always scans the same cell of the input tape as the four-counter machine does. The state of the four-counter machine is stored in the finite control of the two-counter machine. Thus, to determine the move of the four-counter machine, the two-counter machine has only to determine which, if any, of i, j, k, and l are 0. By passing n from one counter to the other, the finite control of the two-counter machine can determine if n is divisible by 2, 3, 5, 7, or any product of these.
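The prime-power encoding of Theorem 7.9 is easy to test numerically: incrementing a count multiplies n by the corresponding prime, decrementing divides, and a zero test is a divisibility test.

```python
PRIMES = (2, 3, 5, 7)         # one prime per simulated counter

def encode(counts):
    """Pack four counts (i, j, k, l) into n = 2^i * 3^j * 5^k * 7^l."""
    n = 1
    for p, c in zip(PRIMES, counts):
        n *= p ** c
    return n

def increment(n, which):      # multiply by the prime for that counter
    return n * PRIMES[which]

def decrement(n, which):      # divide by the prime (count must be > 0)
    assert n % PRIMES[which] == 0, "counter is already zero"
    return n // PRIMES[which]

def is_zero(n, which):        # a count is zero iff its prime does not divide n
    return n % PRIMES[which] != 0

n = encode((1, 0, 2, 0))      # i=1, j=0, k=2, l=0  ->  n = 2 * 25 = 50
print(is_zero(n, 1))          # True: j is zero
n = increment(n, 1)           # now j = 1, n = 150
n = decrement(n, 0)           # now i = 0, n = 75
print(is_zero(n, 0))          # True
```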
Limits on the number of states and symbols

Another way to restrict a TM is to limit the size of the tape alphabet or the number of states. If the tape alphabet, number of tapes, and number of states are all limited, then there is only a finite number of different Turing machines, so the restricted model is less powerful than the original.† If we do not restrict the tape alphabet, then three states and one tape are sufficient to recognize any r.e. set; this result is left as an exercise. We shall, however, prove a result about limited tape alphabets.
Theorem 7.10 If L ⊆ (0 + 1)* is r.e., then L is accepted by a one-tape TM with tape alphabet {0, 1, B}.

Proof Let L = L(M₁), where M₁ = (Q, {0, 1}, Γ, δ, q₀, B, F). Suppose Γ has between 2^{k-1} + 1 and 2^k symbols, so k bits are sufficient to encode any tape symbol of M₁. We may design M₂ with tape alphabet {0, 1, B} to simulate M₁. The tape of M₂ will consist of a sequence of binary codes for the symbols of M₁. The finite control of M₂ remembers the state of M₁ and also remembers the position of M₂'s tape head, modulo k, so M₂ can know when it is at the beginning of a coded tape symbol of M₁.

At the beginning of the simulation of a move of M₁, the head of M₂ is at the left end of a binary-coded symbol of M₁. M₂ scans the next k − 1 symbols to its right, to determine the move of M₁. Then M₂ replaces the symbols scanned to reflect the move of M₁, positions its tape head at the left end of the code for the next symbol scanned by M₁, and changes the state of M₁ recorded in its finite control. If that state is accepting, M₂ accepts; otherwise, M₂ is ready to simulate the next move of M₁.

A special case occurs if M₂ finds its head positioned at a blank when it should be reading a code for a tape symbol of M₁. In this case, M₁ has just moved to a position it has never before reached. M₂ must write the binary code for M₁'s blank on the cell scanned and the k − 1 cells to its right, after which it may simulate a move of M₁ as before.

One important detail is left to be explained. M₂'s input is a binary string in (0 + 1)* representing w itself, rather than a string of coded 0's and 1's representing w. Therefore, before simulating M₁, M₂ must replace w by its code. To do so, M₂ uses the "shifting over" trick, using B for the symbol X described in Section 7.4, where "shifting over" was introduced. For each input symbol, starting with the leftmost, the string to the right of the symbol is shifted k − 1 places right, and then the symbol and the k − 1 B's introduced are replaced by the k-bit binary code for the symbol.

† However, there are restricted Turing machines that are "universal" (see Section 8.3) in the sense that given as input an encoding of a transition function for some TM M and an input w to M, the restricted machine accepts if and only if M accepts w. For example, it is known that there is a universal TM with one tape, 5 states, and 7 tape symbols.
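The k-bit coding in the proof of Theorem 7.10 amounts to plain string manipulation; the sketch below builds a code table for an illustrative tape alphabet and shows how M₂ would locate the symbol under M₁'s head.

```python
from math import ceil, log2

def make_code(gamma):
    """Assign each tape symbol of Gamma a fixed-width binary code of
    k bits, where 2^(k-1) < |Gamma| <= 2^k (so k = 2 for {0, 1, B})."""
    k = max(1, ceil(log2(len(gamma))))
    return k, {s: format(i, "b").zfill(k) for i, s in enumerate(gamma)}

k, code = make_code(["0", "1", "B"])          # k = 2
tape = ["1", "0", "B", "1"]
encoded = "".join(code[s] for s in tape)      # M2's tape contents
print(encoded)                                # '01001001'

# M2 reads the symbol in cell i of M1's tape as a block of k bits:
i = 2
decode = {v: s for s, v in code.items()}
print(decode[encoded[i * k:(i + 1) * k]])     # 'B'
```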
We can apply the same binary coding technique even if the input alphabet is not {0, 1}. We therefore state the following corollary and leave its proof as an exercise.

Corollary If L is an r.e. set over any alphabet whatsoever, then L is accepted by an off-line TM that has only one tape besides the input, and whose tape alphabet is {0, 1, B}.

Theorem 7.11 Every Turing machine can be simulated by an off-line Turing machine having one storage tape with two symbols, 0 (blank) and 1. The Turing machine can print a 0 or 1 over a 0, but cannot print a 0 over a 1.

Proof We leave this to the reader. The "trick" is to create successive ID's of the original Turing machine on the tape of the new one. Tape symbols are, of course, encoded in binary. Each ID is copied over, making the changes necessary to simulate a move of the old machine. In addition to the binary encoding of the original symbols, the TM doing the simulating needs cells to indicate the position of the head in the ID being copied, and cells to indicate that the binary representation of a symbol has already been copied.

EXERCISES

7.1 Design Turing machines to recognize the following languages.
a) {0^n 1^n 0^n | n ≥ 1}.
b) {ww^R | w is in (0 + 1)*}.
c) The set of strings with an equal number of 0's and 1's.

7.2 Design Turing machines to compute the following functions.
a) ⌊log₂ n⌋
b) n!
c) n²
7.3 Show that if L is accepted by a k-tape, l-dimensional, nondeterministic TM with m heads per tape, then L is accepted by a deterministic TM with one semi-infinite tape and one tape head.

7.4 A recursive function is a function defined by a finite set of rules that for various arguments specify the function in terms of variables, nonnegative integer constants, the successor (add one) function, the function itself, or an expression built from these by composition of functions. For example, Ackermann's function is defined by the rules

1) A(0, y) = 1
2) A(1, 0) = 2
3) A(x, 0) = x + 2 for x ≥ 2
4) A(x + 1, y + 1) = A(A(x, y + 1), y)

a) Evaluate A(2, 1).
* b) What function of one variable is A(x, 2)?
* c) Evaluate A(4, 3).

* 7.5 Give recursive definitions for
a) n + m    b) n − m    c) nm    d) n!

7.6 Show that the class of recursive functions is identical to the class of partial recursive functions.

7.7 A function is primitive recursive if it is a finite number of applications of composition and primitive recursion† applied to constant functions, the successor function, or a projection function.
a) Show that every primitive recursive function is a total recursive function.
** b) Show that Ackermann's function is not primitive recursive.
** c) Show that adding the minimization operator, min_x(f(x)), defined as the least x such that f(x) = 0, yields all partial recursive functions.

7.8 Design a Turing machine to enumerate {0^n 1^n | n ≥ 1}.

7.9 Show that every r.e. set is accepted by a TM with only two nonaccepting states and one accepting state.

* 7.10 Complete the proof of Theorem 7.11, that tape symbols 0 (blank) and 1, with no 1 ever overprinted by 0, are sufficient for an off-line TM to accept any r.e. language.

7.11 Consider an off-line TM model that cannot write on any tape but has three pebbles that can be placed on the auxiliary tape. Show that the model can accept any r.e. language.

† A primitive recursion is a definition of f(x₁, ..., xₙ) by

    f(x₁, ..., xₙ) = if xₙ = 0 then g(x₁, ..., x_{n-1}) else h(x₁, ..., xₙ, f(x₁, ..., x_{n-1}, xₙ − 1)),

where g and h are primitive recursive functions.
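The rules of Exercise 7.4 translate directly into code, which is a handy way to check part (a) and to explore part (b); the recursion grows quickly, so only small arguments are practical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def A(x, y):
    """Ackermann's function as defined by rules (1)-(4) of Exercise 7.4."""
    if x == 0:
        return 1                        # rule 1
    if y == 0:
        return 2 if x == 1 else x + 2   # rules 2 and 3
    return A(A(x - 1, y), y - 1)        # rule 4

print(A(2, 1))                          # 4
print([A(x, 2) for x in range(1, 6)])   # [2, 4, 8, 16, 32]
```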
176
TURING MACHINES
BIBLIOGRAPHIC NOTES The Turing machine
is the invention of Turing [1936]. Alternative formulations can be Kleene [1936], Church [1936], or Post [1936]. For a discussion of Church's hypothesis, see Kleene [1952], Davis [1958], or Rogers [1967]. Other formalisms equivalent
found
in
to the partial recursive functions include the A-calculus (Church [1941]), recursive functions
(Kleene [1952]), and Post systems (Post [1943]). Off-line TM's are discussed by Hartmanis, Lewis, and Stearns [1965]. An important result about multihead TM's that they can be simulated without loss of time with one
—
—
Hartmanis and Stearns [1965]. The one case not covered by the latter paper, when the multihead machine runs in real time (a number of moves proportional to the input length), was handled by Fischer, Meyer, and Rosenberg [1972], and Leong and Seiferas [1977] contains the latest on reducing the complexity of that construction. RAM's were formally considered by Cook and Reckhow [1973]. Theorem 7.9, that two counters can simulate a TM, was proved by Minsky [1961]; the proof given here is taken from Fischer [1966]. Theorem 7.11, on TM's that can only print l's over O's, is from Wang [1957]. Exercise 7.9, limiting the number of states, is from Shannon [1956]. In fact, as the latter paper assumes acceptance by halting rather than by final state, it shows that only two states are needed. A number of texts provide an introduction to the theory of Turing machines and recursive functions. These include Davis [1958, 1965], Rogers [1967], Yasuhara [1971], Jones [1973], Brainerd and Landweber [1974], Hennie [1977], and Machtey and Young head per tape
[1978].
is
found
in
CHAPTER 8

UNDECIDABILITY

We now consider the classes of recursive and recursively enumerable languages. The most interesting aspect of this study concerns languages whose strings are interpreted as codings of instances of problems. Consider the problem of determining if an arbitrary Turing machine accepts the empty string. This problem may be formulated as a language problem by encoding TM's as strings of 0's and 1's. The set of all strings encoding TM's that accept the empty string is a language that is recursively enumerable but not recursive. From this we conclude that there can be no algorithm to decide which TM's accept the empty string and which do not. In this chapter we shall show that many questions about TM's, as well as some questions about context-free languages and other formalisms, have no algorithms for their solution. In addition we introduce some fundamental concepts from the theory of recursive functions, including the hierarchy of problems induced by the consideration of Turing machines with "oracles."
8.1 PROBLEMS

Informally we use the word problem to refer to a question such as: "Is a given CFG ambiguous?" In the case of the ambiguity problem, above, an instance of the problem is a particular CFG. In general, an instance of a problem is a list of arguments, one argument for each parameter of the problem. By restricting our attention to problems with yes-no answers and encoding instances of the problem by strings over some finite alphabet, we can transform the question of whether there exists an algorithm for solving a problem to whether or not a particular language is recursive. While it may seem that we are throwing out a lot of important problems by looking only at yes-no problems, in fact such is not the case. Many general problems have yes-no versions that are provably just as difficult as the general problem.
Consider the ambiguity problem for CFG's. Call the yes-no version AMB. A more general version of the problem, called FIND, requires producing a word with two or more parses if one exists and answering "no" otherwise. An algorithm for FIND can be used to solve AMB. If FIND produces a word w, then answer "yes"; if FIND answers "no," then answer "no." Conversely, given an algorithm for AMB we can produce an algorithm for FIND. The algorithm first applies AMB to the grammar G. If AMB answers "no," our algorithm answers "no." If AMB answers "yes," the algorithm systematically begins to generate all words over the terminal alphabet of G. As soon as a word w is generated, it is tested to see if it has two or more parse trees. Note that the algorithm does not begin generating words unless G is ambiguous, so some w eventually will be found and printed. Thus we indeed have an algorithm. The portion of the algorithm that tests w for two or more parses is left as an exercise.

The process whereby we construct an algorithm for one problem (such as FIND), using a supposed algorithm for another (AMB), is called a reduction (of FIND to AMB). In general, when we reduce problem A to problem B we are showing that B is at least as hard as A. Thus in this case, as in many others, the yes-no problem AMB is no easier than the more general version of the problem. Later we shall show that there is no algorithm for AMB. By the reduction of AMB to FIND we conclude there is no algorithm for FIND either, since the existence of an algorithm for FIND implies the existence of an algorithm for AMB, a contradiction.

One further instructive point concerns the coding of the grammar G. As all Turing machines have a fixed alphabet, we cannot treat the 4-tuple notation G = (V, T, P, S) as the encoding of G without modification. We can encode 4-tuples as binary strings as follows. Let the metasymbols in 4-tuples, that is, the left and right parentheses, brackets, comma, and →, be encoded by 1, 10, 100, ..., 10^5, respectively. Let the ith grammar symbol (in any chosen order) be encoded by 10^{5+i}. In this encoding, we cannot tell the exact symbols used for either terminals or nonterminals. Of course renaming nonterminals does not affect the language generated, so their symbols are not important. Although we ordinarily view the identities of the terminals as important, for this problem the actual symbols used for the terminals are irrelevant, since renaming the terminals does not affect the ambiguity or unambiguity of a grammar.
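The binary coding of 4-tuples described above is mechanical; the sketch below fixes one arbitrary ordering of grammar symbols (any ordering works, as the text notes) and encodes a token sequence.

```python
METASYMBOLS = ["(", ")", "[", "]", ",", "->"]

def encode_grammar_string(tokens, symbols):
    """Encode a sequence of metasymbols and grammar symbols in binary:
    the i-th metasymbol (i = 1..6) becomes 1 followed by i-1 zeros,
    and the i-th grammar symbol becomes 1 followed by 5+i zeros."""
    out = []
    for tok in tokens:
        if tok in METASYMBOLS:
            out.append("1" + "0" * METASYMBOLS.index(tok))
        else:
            out.append("1" + "0" * (6 + symbols.index(tok)))
    return "".join(out)

# A toy production S -> a, with the symbol ordering [S, a]:
print(encode_grammar_string(["S", "->", "a"], ["S", "a"]))
# 100000010000010000000
```

Because every code is a 1 followed by a run of 0's, the token boundaries can be recovered uniquely from the bit string.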
Decidable and undecidable problems
A
problem whose language is recursive is said to be decidable. Otherwise, the problem is undecidable. That is, a problem is undecidable if there is no algorithm that takes as input an instance of the problem and determines whether the answer to that instance
is
"yes" or "no."
8.2
RECURSIVE AND RECURSIVELY ENUMERABLE LANGUAGES
|
179
An unintuitive consequence of the definition of "undecidable" is that problems with only a single instance are trivially decidable. Consider the following problem based on Fermat's conjecture. Is there no solution in positive integers to the equation x + / = z if i > 3? Note that x, y, z, and are not parameters but bound variables in the statement of the problem. There is one Turing machine that accepts any input and one that rejects any input. One of these answers Fermat's conjecture correctly, even though we do not know which one. In fact there may not even be a resolution to the conjecture using the axioms of arithmetic. That is, Fermat's conjecture may be true, yet there may be no arithmetic l
l
i
proof of that
The
fact.
possibility (though not the certainty) that this
is
the case
follows from Godel's incompleteness theorem, which states that any consistent
formal system powerful enough to encompass number theory must have statements that are true but not provable within the system. It should not disturb the reader that a conundrum like Fermat's conjecture is "decidable." The theory of undecidability is concerned with the existence or nonexistence of algorithms for solving problems with an infinity of instances.
8.2 PROPERTIES OF RECURSIVE AND RECURSIVELY ENUMERABLE LANGUAGES

A number of theorems in this chapter are proved by reducing one problem to another. These reductions involve combining several Turing machines to form a composite machine. The state of the composite machine has a component for each individual component machine. Similarly the composite machine has separate tapes for each individual machine. The details of the composite machine are usually tedious and provide no insight. Thus we choose to describe the constructions informally.

Given an algorithm (a TM that always halts), we can allow the composite TM to perform one action if the algorithm accepts and another if it does not accept. We could not do this if we were given an arbitrary TM rather than an algorithm, since if the TM did not accept, it might run forever, and the composite machine would never initiate the next task. In pictures, an arrow into a box labeled "start" indicates a start signal. Boxes with no "start" signal are assumed to begin operating when the composite machine does. Algorithms have two outputs, "yes" and "no," which can be used as start signals or as a response by the composite machine. Arbitrary TM's have only a "yes" output, which can be used for the same purposes.

We now turn to some basic closure properties of the classes of recursive and r.e. sets.
Theorem 8.1 The complement of a recursive language is recursive.

Proof Let L be a recursive language and M a Turing machine that halts on all inputs and accepts L. Construct M' from M so that if M enters a final state on input w, then M' halts without accepting. If M halts without accepting w, then M' enters a final state. Since one of these two events occurs, M' is an algorithm. Clearly L(M') is the complement of L, and thus the complement of L is a recursive language. Figure 8.1 pictures the construction of M' from M.

Fig. 8.1 Construction showing that recursive languages are closed under complementation.
The union of two recursive languages is recursive. The union of two enumerable languages is recursively enumerable.
8.2
recursively
M and M M accepts. M and only M accepts. Since both M rejects, then M simulates M and accepts and M are algorithms, M guaranteed to halt. Clearly M accepts L u L Proof
We
Let
L
L2
and
x
construct
be recursive languages accepted by algorithms
M, which
M,.
simulates
first
If
if
2
M
1
if
If
2
is
2
2.
x
accepts, then
2
t
x
x
.
For recursively enumerable languages the above construction does not work, and may not halt. Instead can simultaneously simulate 2 on accepts. Figure 8.2 shows the two conseparate tapes. If either accepts, then since
M
M
M
x
M
x
M
structions of this theorem.
r—
\f
A/,
(b)
(a)
Construction for union.
Fig. 8.2
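The r.e. half of Theorem 8.2 can be mimicked with generators standing in for TMs: each machine is modeled as a generator that yields False per step and True on acceptance, and the composite machine interleaves their steps. The model is illustrative, not the book's construction verbatim.

```python
def run_in_parallel(m1, m2):
    """Simulate two machines one step at a time; accept as soon as
    either does.  A machine that halts without accepting simply
    exhausts its generator."""
    machines = [m1, m2]
    while machines:
        for m in list(machines):
            try:
                if next(m):
                    return True        # one machine accepted
            except StopIteration:
                machines.remove(m)     # halted without accepting
    return False                       # both halted, neither accepted

def toy_machine(accept_at):
    """Accepts after `accept_at` steps; runs forever if accept_at is None."""
    step = 0
    while True:
        step += 1
        if accept_at is not None and step == accept_at:
            yield True
        yield False

# One machine runs forever, the other accepts: the union still accepts.
print(run_in_parallel(toy_machine(None), toy_machine(3)))   # True
```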
Theorem 8.3 If a language L and its complement L̄ are both recursively enumerable, then L (and hence L̄) is recursive.

Proof Let M₁ and M₂ accept L and L̄, respectively. Construct M as in Fig. 8.3 to simulate simultaneously M₁ and M₂. M accepts w if M₁ accepts w and rejects w if M₂ accepts w. Since w is in either L or L̄, we know that exactly one of M₁ or M₂ will accept. Thus M will always say either "yes" or "no," but will never say both. Note that there is no a priori limit on how long it may take before M₁ or M₂ accepts, but it is certain that one or the other will do so. Since M is an algorithm that accepts L, it follows that L is recursive.

Fig. 8.3 Construction for Theorem 8.3.
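Theorem 8.3's construction is the classic "run M₁ and M₂ in parallel" decision procedure. In this sketch the two semi-deciders are again modeled as step generators, one for L and one for its complement; exactly one eventually accepts, so the loop always terminates.

```python
def decide(m_accept, m_reject):
    """Alternate steps of a semi-decider for L (m_accept) and one for
    the complement of L (m_reject).  Exactly one eventually accepts,
    so this always halts with a yes/no answer."""
    while True:
        if next(m_accept):
            return True
        if next(m_reject):
            return False

def semi_decider(accepts, steps_needed):
    """Yield False for a while; yield True at step `steps_needed` iff
    this machine accepts.  Otherwise it runs forever, like a real TM."""
    step = 0
    while True:
        step += 1
        yield accepts and step == steps_needed

# w in L: the L-side accepts after 4 steps; the complement side never does.
print(decide(semi_decider(True, 4), semi_decider(False, 0)))   # True
```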
Theorems 8.1 and 8.3 have an important consequence. Let L and L̄ be a pair of complementary languages. Then either

1) both L and L̄ are recursive,
2) neither L nor L̄ is r.e., or
3) one of L and L̄ is r.e. but not recursive; the other is not r.e.

An important technique for showing a problem undecidable is to show by diagonalization that the complement of the language for that problem is not r.e. Thus case (2) or (3) above must apply. This technique is essential in proving our first problem undecidable. After that, various forms of reductions may be employed to show other problems undecidable.
8.3 UNIVERSAL TURING MACHINES AND AN UNDECIDABLE PROBLEM

We shall now use diagonalization to show a particular problem to be undecidable. The problem is: "Does Turing machine M accept input w?" Here, both M and w are parameters of the problem. In formalizing the problem as a language we shall restrict w to be over alphabet {0, 1} and M to have tape alphabet {0, 1, B}. As the restricted problem is undecidable, the more general problem is surely undecidable as well. We choose to work with the more restricted version to simplify the encoding of problem instances as strings.
Turing machine codes

To begin, we encode Turing machines with restricted alphabets as strings over {0, 1}. Let

    M = (Q, {0, 1}, {0, 1, B}, δ, q₁, B, {q₂})

be a Turing machine with input alphabet {0, 1} and the blank as the only additional tape symbol. We further assume that Q = {q₁, q₂, ..., qₙ} is the set of states, and that q₂ is the only final state. Theorem 7.10 assures us that if L ⊆ (0 + 1)* is accepted by any TM, then it is accepted by one with alphabet {0, 1, B}. Also, there is no need for more than one final state in any TM, since once it accepts it may as well halt.

It is convenient to call symbols 0, 1, and B by the synonyms X₁, X₂, and X₃, respectively. We also give directions L and R the synonyms D₁ and D₂, respectively. Then a generic move δ(qᵢ, Xⱼ) = (qₖ, Xₗ, Dₘ) is encoded by the binary string

    0^i 1 0^j 1 0^k 1 0^l 1 0^m.    (8.1)

A binary code for Turing machine M is

    111 code₁ 11 code₂ 11 ⋯ 11 codeᵣ 111,    (8.2)

where each codeᵢ is a string of the form (8.1), and each move of M is encoded by one of the codeᵢ's. The moves need not be in any particular order, so each TM actually has many codes. Any such code for M will be denoted ⟨M⟩.

Every binary string can be interpreted as the code for at most one TM; many binary strings are not the code of any TM. To see that decoding is unique, note that no string of the form (8.1) has two 1's in a row, so the codeᵢ's can be found directly. If a string fails to begin and end with exactly three 1's, has three 1's other than at the ends, or has two pairs of 1's with other than five blocks of 0's in between, then the string represents no TM.

The pair M and w is represented by a string of the form (8.2) followed by w. Any such string will be denoted ⟨M, w⟩.
Example 8.1 Let M = ({q₁, q₂, q₃}, {0, 1}, {0, 1, B}, δ, q₁, B, {q₂}) have moves:

    δ(q₁, 1) = (q₃, 0, R),
    δ(q₃, 0) = (q₁, 1, R),
    δ(q₃, 1) = (q₂, 0, R),
    δ(q₃, B) = (q₃, 1, L).

Thus one string denoted by ⟨M, 1011⟩ is

    111010010001010011000101010010011
    000100100101001100010001000100101111011

Note that many different strings are also codes for the pair (M, 1011), and any of these may be referred to by the notation ⟨M, 1011⟩.
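The coding scheme (8.1)-(8.2) is easy to mechanize; the sketch below reproduces the first move code of Example 8.1 (symbols X₁, X₂, X₃ stand for 0, 1, B and directions D₁, D₂ for L, R, as in the text).

```python
SYMBOL = {"0": 1, "1": 2, "B": 3}     # X1, X2, X3
DIRECTION = {"L": 1, "R": 2}          # D1, D2

def encode_move(i, symbol, k, write, direction):
    """Encode delta(q_i, X_j) = (q_k, X_l, D_m) as 0^i 1 0^j 1 0^k 1 0^l 1 0^m."""
    j, l, m = SYMBOL[symbol], SYMBOL[write], DIRECTION[direction]
    return "0" * i + "1" + "0" * j + "1" + "0" * k + "1" + "0" * l + "1" + "0" * m

def encode_tm(moves):
    """A binary code for M, as in (8.2): 111 code1 11 code2 11 ... 111."""
    return "111" + "11".join(encode_move(*mv) for mv in moves) + "111"

# The first move of Example 8.1, delta(q1, 1) = (q3, 0, R):
print(encode_move(1, "1", 3, "0", "R"))   # 0100100010100
```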
A non-r.e. language

Suppose we have a list of (0 + 1)* in canonical order (see Section 7.7), where wᵢ is the ith word, and Mⱼ is the TM whose code, as in (8.2), is the integer j written in binary. Imagine an infinite table that tells for all i and j whether wᵢ is in L(Mⱼ). Figure 8.4 suggests such a table;† 0 means wᵢ is not in L(Mⱼ) and 1 means it is.

Fig. 8.4 Hypothetical table indicating acceptance of words by TM's.

We construct a language L_d by using the diagonal entries of the table to determine membership in L_d. To guarantee that no TM accepts L_d, we insist that wᵢ is in L_d if and only if the (i, i) entry is 0, that is, if Mᵢ does not accept wᵢ.

Suppose that some TM Mⱼ accepted L_d. Then we are faced with the following contradiction. If wⱼ is in L_d, then the (j, j) entry is 0, implying that wⱼ is not in L(Mⱼ) and contradicting L_d = L(Mⱼ). On the other hand, if wⱼ is not in L_d, then the (j, j) entry is 1, implying that wⱼ is in L(Mⱼ), which again contradicts L_d = L(Mⱼ). As wⱼ is either in or not in L_d, we conclude that our assumption, L_d = L(Mⱼ), is false. Thus, no TM in the list accepts L_d, and by Theorem 7.10, no TM whatsoever accepts L_d.

We have thus proved

Lemma 8.1 L_d is not r.e.
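The diagonal argument can be made concrete on a finite fragment of the table: whatever finite list of "machines" we write down (here, arbitrary 0/1 rows), the diagonal language differs from every row.

```python
def diagonal_language(table):
    """Given row i = the acceptance vector of machine M_i on words
    w_1, w_2, ..., return the characteristic vector of L_d: word w_i
    is in L_d iff entry (i, i) is 0."""
    return [1 - table[i][i] for i in range(len(table))]

table = [
    [0, 1, 0, 1],   # what M_1 accepts on w_1 .. w_4
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
]
ld = diagonal_language(table)
print(ld)                                            # [1, 0, 1, 1]
# L_d disagrees with every row M_j at position j:
print(all(ld[j] != table[j][j] for j in range(4)))   # True
```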
The universal language

Define Lu, the "universal language," to be {(M, w) | M accepts w}. We call Lu "universal" since the question of whether any particular string w in (0 + 1)* is accepted by any particular Turing machine M is equivalent to the question of whether (M', w) is in Lu, where M' is the TM with tape alphabet {0, 1, B} equivalent to M constructed as in Theorem 7.10.

Theorem 8.4 Lu is recursively enumerable.

Proof We shall exhibit a three-tape TM M1 accepting Lu. The first tape of M1 is the input tape, and the input head on that tape is used to look up moves of the TM M when given code (M, w) as input. Note that the moves of M are found between the first two blocks of three 1's. The second tape of M1 will simulate the tape of M.

† Actually, as all low-numbered Turing machines accept the empty set, the correct portion of the table shown has all 0's.
The alphabet of M is {0, 1, B}, so each symbol of M's tape can be held in one cell of M1's second tape. Observe that if we did not restrict the alphabet of M, we would have to use many cells of M1's tape to simulate one of M's cells, but the simulation could be carried out with a little more work. The third tape holds the state of M, with q_i represented by 0^i. The behavior of M1 is as follows:

1) Check the format of tape 1 to see that it has a prefix of the form (8.2) and that there are no two codes that begin with 0^i10^j1 for the same i and j. Also check that if 0^i10^j10^k10^l10^m is a code, then 1 ≤ j ≤ 3, 1 ≤ l ≤ 3, and 1 ≤ m ≤ 2. Tape 3 can be used as a scratch tape to facilitate the comparison of codes.

2) Initialize tape 2 to contain w, the portion of the input beyond the second block of three 1's. Initialize tape 3 to hold a single 0, representing q1. All three tape heads are positioned on the leftmost symbols. These symbols may be marked so the heads can find their way back.

3) If tape 3 holds 00, the code for the final state, halt and accept.

4) Let X_j be the symbol currently scanned by tape head 2 and let 0^i be the current contents of tape 3. Scan tape 1 from the left end to the second 111, looking for a substring beginning 110^i10^j1. If no such string is found, halt and reject; M has no next move and has not accepted. If such a code is found, let it be 0^i10^j10^k10^l10^m. Then put 0^k on tape 3, print X_l on the tape cell scanned by head 2, and move that head in direction D_m. Note that we have checked in (1) that 1 ≤ l ≤ 3 and 1 ≤ m ≤ 2. Go to step (3).
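Steps (1)–(4) above are, in miniature, an interpreter loop: look up the current (state, symbol) pair in the stored program, rewrite the cell, move, and repeat. A minimal single-tape sketch in Python (the dictionary representation of δ and the toy machine at the end are our assumptions, not the 0^i1... encoding of (8.2)):

```python
def simulate(delta, start, finals, w, max_steps=1000):
    """Run the TM given by delta[(q, X)] = (p, Y, D) on input w.
    Returns True if a final state is reached; the step bound is added
    so the sketch always halts (the real M1 has no such bound)."""
    tape, head, q = dict(enumerate(w)), 0, start
    for _ in range(max_steps):
        if q in finals:
            return True                       # step (3): accept
        move = delta.get((q, tape.get(head, "B")))
        if move is None:
            return False                      # step (4): no next move
        q, tape[head] = move[0], move[1]
        head += 1 if move[2] == "R" else -1
    return False

# Toy machine: scan right in state a; accept on reading a 0.
delta = {("a", "1"): ("a", "1", "R"), ("a", "0"): ("b", "0", "R")}
print(simulate(delta, "a", {"b"}, "110"))   # -> True
print(simulate(delta, "a", {"b"}, "111"))   # -> False
```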
It is straightforward to check that M1 accepts (M, w) if and only if M accepts w. It is also true that if M runs forever on w, M1 will run forever on (M, w), and if M halts on w without accepting, M1 does the same on (M, w). The existence of M1 is sufficient to prove Theorem 8.4. However, by Theorems 7.2 and 7.10, we can find a TM with one semi-infinite tape and alphabet {0, 1, B} accepting Lu. We call this particular TM the universal Turing machine, since it does the work of any TM with input alphabet {0, 1}.
By Lemma 8.1, the diagonal language Ld is not r.e., and hence not recursive. Thus by Theorem 8.1, L̄d is not recursive. Note that L̄d = {w_i | M_i accepts w_i}. We can prove the universal language Lu = {(M, w) | M accepts w} not to be recursive by reducing L̄d to Lu. Thus Lu is an example of a language that is r.e. but not recursive. In fact, L̄d is another example of such a language.

Theorem 8.5 Lu is not recursive.

Proof Suppose A were an algorithm recognizing Lu. Then we could recognize L̄d as follows. Given string w in (0 + 1)*, determine by an easy calculation the value of i such that w = w_i. Integer i in binary is the code for some TM M_i. Feed (M_i, w_i) to algorithm A and accept w if and only if M_i accepts w_i. The construction is shown in Fig. 8.5.

Fig. 8.5 Reduction of L̄d to Lu: a hypothetical algorithm A for Lu inside the constructed algorithm for L̄d.

It is easy to check that the constructed algorithm accepts w if and only if w = w_i and w_i is in L(M_i). Thus we have an algorithm for L̄d. Since no such algorithm exists, we know our assumption, that algorithm A for Lu exists, is false. Hence Lu is r.e. but not recursive.
8.4 RICE'S THEOREM AND SOME MORE UNDECIDABLE PROBLEMS

We now have an example of an r.e. language that is not recursive. The associated problem "Does M accept w?" is undecidable, and we can use this fact to show that other problems are undecidable. In this section we shall give several examples of undecidable problems concerning r.e. sets. In the next three sections we shall discuss some undecidable problems taken from outside the realm of TM's.
Example 8.2 Consider the problem: "Is L(M) ≠ ∅?" Let (M) denote a code for M as in (8.2). Then define

Lne = {(M) | L(M) ≠ ∅}  and  Le = {(M) | L(M) = ∅}.

Note that Le and Lne are complements of one another, since every binary string denotes some TM; those with a bad format denote the TM with no moves, and all these strings are in Le. We claim that Lne is r.e. but not recursive and that Le is not r.e.

We show that Lne is r.e. by constructing a TM M to recognize codes of TM's that accept nonempty sets. Given input (M_i), M nondeterministically guesses a string x accepted by M_i and verifies that M_i does indeed accept x by simulating M_i on input x.†

Now we must show that Le is not recursive. Suppose it were. Then we could construct an algorithm for Lu, violating Theorem 8.5. Let A be a hypothetical algorithm accepting Le. There is an algorithm B that, given (M, w), constructs a TM M' that accepts ∅ if M does not accept w and accepts (0 + 1)* if M accepts w. The plan of M' is shown in Fig. 8.6. M' ignores its input x and instead simulates M on input w, accepting if M accepts.

Note that M' is not B. Rather, B is like a compiler that takes (M, w) as "source program" and produces M' as "object program." We have described what M' does, but not how B constructs it. The construction of B is simple. It takes (M, w) and isolates w. Say w = a1 a2 ⋯ an is of length n. B creates n + 3 states q1, q2, ..., q_{n+3} with the moves

δ(q1, X) = (q2, $, R) for any X  (print marker),
δ(q_i, X) = (q_{i+1}, a_{i-1}, R) for any X and 2 ≤ i ≤ n + 1  (print w),
δ(q_{n+2}, X) = (q_{n+2}, B, R) for X ≠ B  (erase tape),
δ(q_{n+2}, B) = (q_{n+3}, B, L),
δ(q_{n+3}, X) = (q_{n+3}, X, L) for X ≠ $  (find marker).

Having produced the code for these moves, B then adds n + 3 to the indices of the states of M and includes the move

δ(q_{n+3}, $) = (q_{n+4}, $, R)  /* start up M */

and all the moves of M in its generated TM. The resulting TM has an extra tape symbol $, but by Theorem 7.10 we may construct M' with tape alphabet {0, 1, B}, and we may surely make q2 the accepting state. This step completes the algorithm B, and its output is the desired M' of Fig. 8.6.

Fig. 8.6 The TM M'.

Now suppose algorithm A accepting Le exists. Then we construct an algorithm C for Lu as in Fig. 8.7. If M accepts w, then L(M') ≠ ∅; so A says "no" and C says "yes." If M does not accept w, then L(M') = ∅, A says "yes," and C says "no." As C does not exist by Theorem 8.5, A cannot exist. Thus Le is not recursive. If Lne were recursive, Le would be also, by Theorem 8.1. Thus Lne is r.e. but not recursive. If Le were r.e., then Le and Lne would both be recursive by Theorem 8.3. Thus Le is not r.e.

† This step can also be carried out deterministically if we use the pair generator described in Section 7.7. For pair (j, k) simulate M_i on the jth binary string (in canonical order) for k steps. If M_i accepts, then M accepts (M_i).
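The first phase of B is mechanical enough to write out. The sketch below builds the transition table for states q1, ..., q_{n+3} (plain integers stand in for the states, and the alphabet is an assumed parameter); shifting M's states by n + 3 and attaching its moves is omitted:

```python
def writer_moves(w, alphabet=("0", "1", "B", "$")):
    """Moves that print $w, erase the rest of the tape, and return to
    the marker, as in the construction of B sketched in Example 8.2."""
    n, delta = len(w), {}
    for X in alphabet:
        delta[(1, X)] = (2, "$", "R")                 # print marker
        for i in range(2, n + 2):
            delta[(i, X)] = (i + 1, w[i - 2], "R")    # print w
        if X != "B":
            delta[(n + 2, X)] = (n + 2, "B", "R")     # erase tape
        if X != "$":
            delta[(n + 3, X)] = (n + 3, X, "L")       # find marker
    delta[(n + 2, "B")] = (n + 3, "B", "L")
    delta[(n + 3, "$")] = (n + 4, "$", "R")           # start up M
    return delta

moves = writer_moves("01")
```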
Fig. 8.7 Algorithm C constructed for Lu, assuming that algorithm A for Le exists.

Example 8.3 Consider the languages

Lr = {(M) | L(M) is recursive}  and  Lnr = {(M) | L(M) is not recursive}.

Note that Lr is not {(M) | M halts on all inputs}, although it includes the latter language. A TM M could accept a recursive language even though M itself might loop forever on some words not in L(M); some other TM equivalent to M must always halt, however. We claim neither Lr nor Lnr is r.e.

Suppose Lr were r.e. Then we could construct a TM for L̄u, which we know does not exist. Let Mr be a TM accepting Lr. We may construct an algorithm A that takes (M, w) as input and produces as output a TM M' such that

L(M') = ∅ if M does not accept w,
L(M') = Lu if M accepts w.

Note that Lu is not recursive, so M' accepts a recursive language if and only if M does not accept w. The plan of M' is shown in Fig. 8.8. As in the previous example, we have described the output of A; we leave the construction of A to the reader.

Fig. 8.8 The TM M'.

Given A and Mr, we could construct a TM accepting L̄u, shown in Fig. 8.9, which behaves as follows. On input (M, w) it applies A to produce M', then feeds M' to Mr, accepting if and only if Mr does, that is, if and only if L(M') is recursive, which occurs exactly when M does not accept w. As no TM accepting L̄u exists, Mr cannot exist, and Lr is not r.e.

Fig. 8.9 Hypothetical TM for Lr.

Now let us turn to Lnr. Suppose we have a TM Mnr accepting Lnr. Then we may use Mnr and an algorithm B, to be constructed by the reader, to accept L̄u. B takes (M, w) as input and produces as output a TM M' such that

L(M') = Σ* if M accepts w,
L(M') = Lu if M does not accept w.

Thus M' accepts a recursive language if and only if M accepts w. M', which B must produce, is shown in Fig. 8.10(a), and a TM to accept L̄u given B and Mnr is shown in Fig. 8.10(b). The TM of Fig. 8.10(b) accepts (M, w) if and only if L(M') is not recursive, or equivalently, if and only if M does not accept w. That is, the TM accepts (M, w) if and only if (M, w) is in L̄u. Since we have already shown that no such TM exists, the assumption that Mnr exists is false. We conclude that Lnr is not r.e.

Fig. 8.10 Constructions used in proof that Lnr is not r.e. (a) TM M'. (b) TM for L̄u.
Rice's Theorem for recursive index sets

The above examples show that we cannot decide if the set accepted by a Turing machine is empty or recursive. The technique of proof can also be used to show that we cannot decide if the set accepted is finite, infinite, regular, context free, has an even number of strings, or satisfies many other predicates. What then can we decide about the set accepted by a TM? Only the trivial predicates, such as "Does the TM accept an r.e. set?," which are either true for all TM's or false for all TM's.

In what follows we shall discuss languages that represent properties of r.e. languages. That is, the languages are sets of TM codes such that membership of (M) in the language depends only on L(M), not on M itself. Later we shall consider languages of TM codes that depend on the TM itself, such as "M has 27 states," which may be satisfied for some but not all of the TM's accepting a given language.

Let 𝒮 be a set of r.e. languages, each a subset of (0 + 1)*. 𝒮 is said to be a property of the r.e. languages. A set L has property 𝒮 if L is an element of 𝒮. For example, the property of being infinite is {L | L is infinite}. 𝒮 is a trivial property if 𝒮 is empty or 𝒮 consists of all r.e. languages. Let L𝒮 be the set {(M) | L(M) is in 𝒮}.

Theorem 8.6 (Rice's Theorem) Any nontrivial property 𝒮 of the r.e. languages is undecidable.
Proof Without loss of generality assume that ∅ is not in 𝒮 (otherwise consider the complement of 𝒮). Since 𝒮 is nontrivial, there exists L with property 𝒮. Let M_L be a TM accepting L.

Suppose 𝒮 were decidable. Then there is an algorithm M𝒮 accepting L𝒮. We use M_L and M𝒮 to construct an algorithm for Lu as follows. First construct an algorithm A that takes (M, w) as input and produces (M') as output, where L(M') is in 𝒮 if and only if M accepts w ((M, w) is in Lu). The design of M' is shown in Fig. 8.11. M' ignores its input x and simulates M on w. If M does not accept w, then M' does not accept x. If M accepts w, then M' simulates M_L on x, accepting x if and only if x is in L. Thus M' either accepts ∅ or L, depending on whether M accepts w.

Fig. 8.11 M' used in Rice's theorem.

We may use the hypothetical M𝒮 to determine if L(M') is in 𝒮. Since L(M') is in 𝒮 if and only if (M, w) is in Lu, we have an algorithm for recognizing Lu, a contradiction. Thus L𝒮 must be undecidable. Note how this proof generalizes Example 8.2.

Theorem 8.6 has a great variety of consequences, some of which are summarized in the following corollary.

Corollary The following properties of r.e. sets are not decidable:

a) emptiness,
b) finiteness,
c) regularity,
d) context-freedom.
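The heart of the proof is the map (M, w) → M'. Modeling TM's as Python predicates (a loose stand-in: real predicates here are assumed total, while a true simulation of M on w may never halt), the construction is just:

```python
def rice_M_prime(M, w, M_L):
    """M' of Fig. 8.11: ignore the input x, run M on w, and on
    acceptance hand x to M_L.  M and M_L are stand-in predicates;
    in the real construction M(w) is an unbounded simulation."""
    def M_prime(x):
        return M(w) and M_L(x)
    return M_prime

# Toy property: L = strings of even length, accepted by stand-in M_L.
M_L = lambda x: len(x) % 2 == 0
M = lambda w: w.endswith("1")            # stand-in "TM": accepts ...1
accepts = rice_M_prime(M, "01", M_L)     # M accepts 01, so L(M') = L
rejects = rice_M_prime(M, "00", M_L)     # M rejects 00, so L(M') is empty
```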
Rice's Theorem for recursively enumerable index sets

The condition under which a set L𝒮 is r.e. is far more complicated. We shall show that L𝒮 is r.e. if and only if

1) If L is in 𝒮 and L ⊆ L', for some r.e. L', then L' is in 𝒮 (the containment property).
2) If L is an infinite language in 𝒮, then there is a finite subset of L in 𝒮.
3) The set of finite languages in 𝒮 is enumerable, in the sense that there is a Turing machine that generates the (possibly) infinite string code_1 # code_2 # ..., where code_i is a code for the ith finite language in 𝒮 (in any order). The code for the finite language {w1, w2, ..., wn} is just w1, w2, ..., wn.

We prove this characterization with a series of lemmas.

Lemma 8.2 If 𝒮 does not have the containment property, then L𝒮 is not r.e.

Proof Let L1 ⊆ L2, where L1 is in 𝒮 and L2, an r.e. language, is not in 𝒮. [For the case where L𝒮 was the nonrecursive Le, we chose L1 = ∅ and L2 = (0 + 1)*.] Let M1 and M2 be TM's accepting L1 and L2, respectively. Construct algorithm A that takes (M, w) as input and produces as output a TM M' whose behavior is shown in Fig. 8.12. M' accepts x whenever x is in L1. If M accepts w, then M2 is started, and M' also accepts x whenever x is in L2. Thus

L(M') = L1 if M does not accept w,
L(M') = L2 if M accepts w.

As L1 ⊆ L2, L(M') is in 𝒮 if and only if M does not accept w.

Fig. 8.12 The TM M'.

We again leave it to the reader to design the "compiler" A that takes (M, w) as input and connects M with the fixed Turing machines M1 and M2 to construct the M' shown in Fig. 8.12. Having constructed A, we can use a TM M𝒮 that accepts L𝒮 to accept L̄u, as shown in Fig. 8.13. This TM accepts (M, w) if and only if M' accepts a language in 𝒮, or equivalently, if and only if M does not accept w. As such a TM does not exist, M𝒮 cannot exist, so L𝒮 is not r.e.

Fig. 8.13 Hypothetical TM to accept L̄u.

We now turn to the second property of recursively enumerable index sets.
Lemma 8.3 If 𝒮 has an infinite language L such that no finite subset of L is in 𝒮, then L𝒮 is not r.e.

Proof Suppose L𝒮 were r.e. Let M𝒮 be a TM accepting L𝒮. We shall show that L̄u would be r.e. as follows. Let M1 be a TM accepting L. Construct an algorithm A to take a pair (M, w) as input and produce as output a TM M' that accepts L if w is not in L(M), and accepts some finite subset of L otherwise. As shown in Fig. 8.14, M' simulates M1 on its input x. If M1 accepts x, then M' simulates M on w for |x| moves. M' accepts x if M1 accepts x and M has not accepted w within |x| moves. We leave the design of M' as an exercise.

Fig. 8.14 Construction of M'.

If w is in L(M), then M accepts w after some number of moves, say j. Then L(M') = {x | x is in L and |x| < j}, which is a finite subset of L. If w is not in L(M), then L(M') = L. Hence, if M does not accept w, L(M') is in 𝒮, and if M accepts w, L(M'), being a finite subset of L, is not in 𝒮 by the hypothesis of the lemma. An argument that is by now standard proves that if L𝒮 is r.e., so is L̄u. Since the latter is not r.e., we conclude the former is not either.
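The M' of Fig. 8.14 is easy to express if we model M1 as a predicate and M as a step-bounded predicate accepts_within(w, steps); both stand-ins are our assumptions and are taken to be total:

```python
def truncate(M1, M_accepts_within, w):
    """M' accepts x iff M1 accepts x and M has not accepted w within
    |x| moves -- so L(M') is all of L, or a finite subset of it."""
    def M_prime(x):
        return M1(x) and not M_accepts_within(w, len(x))
    return M_prime

ones = lambda x: set(x) <= {"1"}          # L = 1*, an infinite language
in_3_steps = lambda w, s: s >= 3          # pretend M accepts w at step 3
never = lambda w, s: False                # pretend M never accepts w

finite_part = truncate(ones, in_3_steps, "w")   # L(M') = {e, 1, 11}
all_of_L = truncate(ones, never, "w")           # L(M') = L
```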
Finally, consider the third property of r.e. index sets.

Lemma 8.4 If L𝒮 is r.e., then the list of binary codes for the finite sets in 𝒮 is enumerable.

Proof We use the pair generator described in Section 7.7. When (i, j) is generated, we treat i as the binary code of a finite set, assuming 0 is the code for comma, 10 the code for zero, and 11 the code for one. We may in a straightforward manner construct a TM M_i (essentially a finite automaton) that accepts exactly the words in the finite language represented by i. We then simulate the enumerator for L𝒮 for j steps. If it has printed (M_i), we print the code for the finite set represented by i, that is, the binary representation of i itself, followed by a delimiter symbol #. In any event, after the simulation we return control to the pair generator, which generates the pair following (i, j).
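Both devices used in the proof — the pair generator of Section 7.7 and the comma/0/1 coding of finite sets — can be sketched directly (the generator's diagonal order and the treatment of ill-formed codes are our assumptions):

```python
from itertools import islice

def pair_generator():
    """Yield all pairs (i, j) with i, j >= 1, diagonal by diagonal."""
    total = 2
    while True:
        for i in range(1, total):
            yield (i, total - i)
        total += 1

def decode_finite_set(i):
    """Read binary(i) as a finite set of words: 0 is a comma, 10 codes
    the symbol 0, and 11 codes the symbol 1 (Lemma 8.4's convention)."""
    bits, words, cur, k = bin(i)[2:], set(), "", 0
    while k < len(bits):
        if bits[k] == "0":
            words.add(cur); cur, k = "", k + 1
        elif k + 1 < len(bits):
            cur += bits[k + 1]; k += 2   # 10 -> "0", 11 -> "1"
        else:
            return None                  # trailing lone 1: ill-formed code
    words.add(cur)
    return words
```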
Theorem 8.7 L𝒮 is r.e. if and only if

1) If L is in 𝒮 and L ⊆ L', for some r.e. L', then L' is in 𝒮.
2) If L is an infinite set in 𝒮, then there is some finite subset L' of L that is in 𝒮.
3) The set of finite languages in 𝒮 is enumerable.

Proof The "only if" part is Lemmas 8.2, 8.3, and 8.4. For the "if" part, suppose (1), (2), and (3) hold. We construct a TM M1 accepting L𝒮 as follows. On input (M), M1 generates pairs (i, j) using the pair generator. In response to (i, j), M1 simulates for i steps the TM M2 which is an enumerator of the finite sets in 𝒮, existing by condition (3). Let E be the last set completely printed out by M2. [If there is no set completely printed, generate the next (i, j) pair.] Then M1 simulates M for j steps on each word in E. If M accepts all words in E, then M1 accepts (M). If not, M1 generates the next (i, j)-pair.

We use conditions (1) and (2) to show that L(M1) = L𝒮. Suppose L is in 𝒮, and let M be any TM with L(M) = L. By condition (2), there is a finite set E ⊆ L in 𝒮 (take E = L if L is finite). E is generated by M2 after some i steps; let j be the maximum number of steps taken by M to accept a word in E (if E = ∅, let j = 1). Then when M1 generates (i, j), if not sooner, M1 will accept (M).

Conversely, suppose M1 accepts (M). Then there is some (i, j) such that M within j steps accepts every word in some finite language E that M2 generates within its first i steps. Then E is in 𝒮, and E ⊆ L(M). By condition (1), L(M) is in 𝒮, so (M) is in L𝒮. We conclude that L(M1) = L𝒮.

Theorem 8.7 has a great variety of consequences. We summarize some of them as corollaries and leave others as exercises.

Corollary 1 The following properties of r.e. sets are not r.e.:

a) L = ∅,
b) L = Σ*,
c) L is recursive,
d) L is not recursive,
e) L is a singleton,
f) L is a regular set,
g) L − Lu ≠ ∅.

Proof In each case condition (1) is violated, except for (b), where (2) is violated, and (g), where (3) is violated.

Corollary 2 The following properties of r.e. sets are r.e.:

a) L ≠ ∅,
b) L contains at least 10 members,
c) w is in L, for some fixed word w,
d) L ∩ Lu ≠ ∅.
Problems about Turing machines

Does Theorem 8.6 say that everything about Turing machines is undecidable? The answer is no. That theorem has to do only with properties of the language accepted, not properties of the Turing machine itself. For example, the question "Does a given Turing machine have an even number of states?" is clearly decidable. When dealing with properties of Turing machines themselves, we must use our ingenuity. We give two examples.

Example 8.4 It is undecidable if a Turing machine with alphabet {0, 1, B} ever prints three consecutive 1's on its tape. For each Turing machine M_i we construct M̂_i, which on blank tape simulates M_i on blank tape. However, M̂_i uses 01 to encode a 0 and 10 to encode a 1. If M_i's tape has a 0 in cell j, M̂_i has 01 in cells 2j − 1 and 2j. If M_i changes a symbol, M̂_i changes the corresponding 1 to 0, then the paired 0 to 1. One can easily design M̂_i so that M̂_i never has three consecutive 1's on its tape. Now further modify M̂_i so that if M_i accepts, M̂_i prints three consecutive 1's and halts. Thus M̂_i prints three consecutive 1's if and only if M_i accepts ε. By Theorem 8.6, it is undecidable whether a TM accepts ε, since the predicate "ε is in L" is not trivial. Thus the question of whether an arbitrary Turing machine ever prints three consecutive 1's is undecidable.

Example 8.5 It is decidable whether a single-tape Turing machine started on blank tape scans any cell four or more times. If the Turing machine never scans any cell four or more times, then every crossing sequence (sequence of states in which the boundary between cells is crossed, assuming states change before the head moves) is of length at most three. But there is a finite number of distinct crossing sequences of length three or less. Thus either the Turing machine stays within a fixed bounded number of tape cells, in which case finite automaton techniques answer the question, or some crossing sequence repeats. But if some crossing sequence repeats, then the TM moves right with some easily detectable pattern, and the question is again decidable.
8.5 UNDECIDABILITY OF POST'S CORRESPONDENCE PROBLEM

Undecidable problems arise in a variety of areas. In the next three sections we explore some of the more interesting problems in language theory and develop techniques for proving particular problems undecidable. We begin with Post's Correspondence Problem, it being a valuable tool in establishing other problems to be undecidable.

An instance of Post's Correspondence Problem (PCP) consists of two lists, A = w1, ..., wk and B = x1, ..., xk, of strings over some alphabet Σ. This instance of PCP has a solution if there is any sequence of integers i1, i2, ..., im, with m ≥ 1, such that

w_{i1} w_{i2} ⋯ w_{im} = x_{i1} x_{i2} ⋯ x_{im}.

The sequence i1, ..., im is a solution to this instance of PCP.

Example 8.6 Let Σ = {0, 1}. Let A and B be lists of three strings each, as defined in Fig. 8.15. In this case PCP has a solution. Let m = 4, i1 = 2, i2 = 1, i3 = 1, and i4 = 3. Then

w2 w1 w1 w3 = x2 x1 x1 x3 = 101111110.

      List A    List B
  i   w_i       x_i
  1   1         111
  2   10111     10
  3   10        0

Fig. 8.15 An instance of PCP.

Example 8.7 Let Σ = {0, 1}. Let A and B be lists of three strings as shown in Fig. 8.16.

      List A    List B
  i   w_i       x_i
  1   10        101
  2   011       11
  3   101       011

Fig. 8.16 Another PCP instance.
Suppose that this instance of PCP has a solution i1, i2, ..., im. Clearly i1 = 1, since no string beginning with w2 = 011 can equal a string beginning with x2 = 11, and no string beginning with w3 = 101 can equal a string beginning with x3 = 011. We write the string from list A above the corresponding string from list B. So far we have

10
101

The next selection from A must begin with a 1. Thus i2 = 1 or i2 = 3. But i2 = 1 will not do, since no string beginning with w1w1 = 1010 can equal a string beginning with x1x1 = 101101. With i2 = 3, we have

10101
101011

Since the string from list B again exceeds the string from list A by the single symbol 1, a similar argument shows that i3 = i4 = ⋯ = 3. Thus there is only one sequence of choices that generates compatible strings, and for this sequence the string from list B is always one character longer. Thus this instance of PCP has no solution.
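There is no algorithm for PCP in general, but nothing prevents a bounded search, and on small instances it behaves exactly as the two examples above suggest. A breadth-first sketch (the depth bound is our assumption):

```python
def pcp_search(A, B, max_len=8):
    """Search for a solution i_1..i_m (1-based) with m <= max_len.
    A sequence is kept only while one concatenation is a prefix of
    the other.  Returning None means only "no solution within the
    bound" -- PCP itself is undecidable."""
    frontier = [[]]
    for _ in range(max_len):
        nxt = []
        for seq in frontier:
            for i in range(1, len(A) + 1):
                s = seq + [i]
                w = "".join(A[j - 1] for j in s)
                x = "".join(B[j - 1] for j in s)
                if w == x:
                    return s
                if w.startswith(x) or x.startswith(w):
                    nxt.append(s)
        frontier = nxt
    return None

ex_8_6 = (["1", "10111", "10"], ["111", "10", "0"])
ex_8_7 = (["10", "011", "101"], ["101", "11", "011"])
print(pcp_search(*ex_8_6))   # -> [2, 1, 1, 3]
print(pcp_search(*ex_8_7))   # -> None
```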
We show that PCP is undecidable by showing that, if it were decidable, we would have an algorithm for Lu. First, we show that, if PCP were decidable, a modified version of PCP would also be decidable. The Modified Post's Correspondence Problem (MPCP) is the following: Given lists A and B, of k strings each from Σ*, say

A = w1, w2, ..., wk  and  B = x1, x2, ..., xk,

does there exist a sequence of integers i1, i2, ..., ir, 1 ≤ i_j ≤ k, such that

w1 w_{i1} w_{i2} ⋯ w_{ir} = x1 x_{i1} x_{i2} ⋯ x_{ir}?

The difference between the MPCP and PCP is that in the MPCP, a solution is required to start with the first string on each list.

Lemma 8.5 If PCP were decidable, then MPCP would be decidable. That is, MPCP reduces to PCP.

Proof Let A = w1, w2, ..., wk and B = x1, x2, ..., xk be an instance of the MPCP. We convert this instance of MPCP to an instance of PCP that has a solution if and only if our MPCP instance has a solution. If PCP were decidable, we would then be able to solve the MPCP, proving the lemma.

Let Σ be the smallest alphabet containing all the symbols in lists A and B, and let ¢ and $ not be in Σ. Let y_i be obtained from w_i by inserting the symbol ¢ after each character of w_i, and let z_i be obtained from x_i by inserting the symbol ¢ ahead of each character of x_i. Create new words

y0 = ¢y1,   z0 = z1,   y_{k+1} = $,   z_{k+1} = ¢$.
Let C = y0, y1, ..., y_{k+1} and D = z0, z1, ..., z_{k+1} be the lists constructed from the MPCP lists A and B. For example, the lists C and D constructed from the lists A and B of Example 8.6 are shown in Fig. 8.17.

      List C         List D
  i   y_i            z_i
  0   ¢1¢            ¢1¢1¢1
  1   1¢             ¢1¢1¢1
  2   1¢0¢1¢1¢1¢     ¢1¢0
  3   1¢0¢           ¢0
  4   $              ¢$

Fig. 8.17 Corresponding instances of MPCP and PCP.
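The conversion is purely textual, so it is easy to reproduce; the sketch below rebuilds Fig. 8.17 from the lists of Example 8.6 (ordinary string characters stand in for the symbols ¢ and $):

```python
def mpcp_to_pcp(A, B, c="¢", end="$"):
    """Lemma 8.5's conversion: interleave ¢ after each symbol of w_i and
    before each symbol of x_i, then add the forcing pair (y_0, z_0) and
    the closing pair (y_{k+1}, z_{k+1})."""
    y = ["".join(ch + c for ch in w) for w in A]
    z = ["".join(c + ch for ch in x) for x in B]
    C = [c + y[0]] + y + [end]
    D = [z[0]] + z + [c + end]
    return C, D

C, D = mpcp_to_pcp(["1", "10111", "10"], ["111", "10", "0"])
```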
The lists C and D represent an instance of PCP. We claim that this instance of PCP has a solution if and only if the instance of MPCP represented by lists A and B has a solution. To see this, note that if i1, i2, ..., ir is a solution to MPCP with lists A and B, then 0, i1, i2, ..., ir, k + 1 is a solution to PCP with lists C and D. Likewise, if i1, i2, ..., ir is a solution to PCP with lists C and D, then i1 = 0 and ir = k + 1, since y0 and z0 are the only words with the same index that begin with the same symbol, and y_{k+1} and z_{k+1} are the only words with the same index that end with the same symbol. Let j be the smallest integer such that i_j = k + 1. Then i1, i2, ..., ij is also a solution, since the symbol $ occurs only as the last symbol of y_{k+1} and z_{k+1} and, for no l < j, is i_l = k + 1. Clearly i2, i3, ..., i_{j−1} is a solution to MPCP for lists A and B.

If there is an algorithm to decide PCP, we can construct an algorithm to decide MPCP by converting any instance of MPCP to PCP as above.

Undecidability of PCP

Theorem 8.8 PCP is undecidable.
Proof With Lemma 8.5, it is sufficient to show that if MPCP were decidable, then it would be decidable whether a TM accepts a given word. That is, we reduce Lu to MPCP, which by Lemma 8.5 reduces to PCP. For each M and w we construct an instance of MPCP that has a solution if and only if M accepts w. We do this by constructing an instance of MPCP that, if it has a solution, has one that starts with

#q_0 w# α_1 q_1 β_1 # α_2 q_2 β_2 # ⋯ # α_k q_k β_k #,

where strings between successive #'s are successive ID's in a computation of M with input w, and q_k is a final state.
Formally, the pairs of strings forming lists A and B of the instance of MPCP are given below. Since, except for the first pair, which must be used first, the order of the pairs is irrelevant to the existence of a solution, the pairs will be given without indexing numbers. We assume there are no moves from a final state.

The first pair is:

  List A    List B
  #         #q_0 w#

The remaining pairs are grouped as follows:

Group I

  List A    List B
  X         X          for each X in Γ
  #         #

Group II For each q in Q − F, p in Q, and X, Y, Z in Γ:

  List A    List B
  qX        Yp         if δ(q, X) = (p, Y, R)
  ZqX       pZY        if δ(q, X) = (p, Y, L)
  q#        Yp#        if δ(q, B) = (p, Y, R)
  Zq#       pZY#       if δ(q, B) = (p, Y, L)

Group III For each q in F, and X and Y in Γ:

  List A    List B
  XqY       q
  Xq        q
  qY        q

Group IV

  List A    List B
  q##       #          for each q in F

Let us say that (x, y) is a partial solution to MPCP with lists A and B if x is a prefix of y, and x and y are the concatenation of corresponding strings of lists A and B, respectively. If xz = y, then call z the remainder of (x, y).
Suppose that from ID q_0 w there is a valid sequence of k moves. We claim that there is a partial solution

(x, y) = (#q_0 w# α_1 q_1 β_1 # ⋯ # α_{k−1} q_{k−1} β_{k−1} #,  #q_0 w# α_1 q_1 β_1 # ⋯ # α_k q_k β_k #).

Moreover, this is the only partial solution whose larger string is as long as |y|.

The above statement is easy to prove by induction on k. It is trivial for k = 0, since the pair (#, #q_0 w#) must be chosen first.

Suppose that the statement is true for some k and that q_k is not in F. We can easily show that it is true for k + 1. The remainder of the pair (x, y) is z = α_k q_k β_k #. The next pairs must be chosen so that their strings from list A form z. No matter what symbols appear to the right and left of q_k, there is at most one pair in Group II that will enable the partial solution to be continued past q_k. This pair represents, in a natural way, the move of M from ID α_k q_k β_k. The other symbols of z force choices from Group I. No other choices will enable z to be composed of elements in list A.

We can thus obtain a new partial solution, (y, y α_{k+1} q_{k+1} β_{k+1} #). It is straightforward to see that α_{k+1} q_{k+1} β_{k+1} is the one ID that M can reach on one move from α_k q_k β_k. Also, there is no other partial solution whose length of second string equals |y α_{k+1} q_{k+1} β_{k+1} #|.

In addition, if q_k is in F, it is easy to find pairs from Groups I and III which, when preceded by the partial solution (x, y) and followed by the pair in Group IV, provide a solution to MPCP with lists A and B.

Thus if M, started in ID q_0 w, reaches an accepting state, the instance of MPCP with lists A and B has a solution. If M does not reach an accepting state, no pairs from Groups III or IV may be used. Therefore, there may be partial solutions, but the string from B must exceed the string from A in length, so no solution is possible.

We conclude that the instance of MPCP has a solution if and only if M with input w halts in an accepting state. Since the above construction can be carried out for arbitrary M and w, it follows that if there were an algorithm to solve MPCP, then there would be an algorithm to recognize Lu, contradicting Theorem 8.5.

Example 8.8 Let M = ({q1, q2, q3}, {0, 1, B}, {0, 1}, δ, q1, B, {q3}), and let δ be defined by:

  δ(q1, 0) = (q2, 1, R)     δ(q2, 0) = (q3, 0, L)
  δ(q1, 1) = (q2, 0, L)     δ(q2, 1) = (q1, 0, R)
  δ(q1, B) = (q2, 1, L)     δ(q2, B) = (q2, 0, R)

Let w = 01. We construct an instance of MPCP with lists A and B. The first pair is # for list A and #q101# for list B. The remaining pairs are:†

Group I

  List A    List B
  0         0
  1         1
  #         #

Group II

  List A    List B
  q10       1q2        from δ(q1, 0) = (q2, 1, R)
  0q11      q200       from δ(q1, 1) = (q2, 0, L)
  1q11      q210       from δ(q1, 1) = (q2, 0, L)
  0q1#      q201#      from δ(q1, B) = (q2, 1, L)
  1q1#      q211#      from δ(q1, B) = (q2, 1, L)
  0q20      q300       from δ(q2, 0) = (q3, 0, L)
  1q20      q310       from δ(q2, 0) = (q3, 0, L)
  q21       0q1        from δ(q2, 1) = (q1, 0, R)
  q2#       0q2#       from δ(q2, B) = (q2, 0, R)

Group III

  List A    List B
  0q30      q3
  0q31      q3
  1q30      q3
  1q31      q3
  0q3       q3
  1q3       q3
  q30       q3
  q31       q3

Group IV

  List A    List B
  q3##      #

Note that M accepts input w = 01 by the sequence of ID's: q101, 1q21, 10q1, 1q201, q3101. Let us see if there is a solution to the MPCP we have constructed. The first pair gives a partial solution (#, #q101#). Inspection of the pairs indicates that the only way to get a longer partial solution is to use the pair (q10, 1q2) next. The resulting partial solution is (#q10, #q101#1q2). The remainder is now 1#1q2. The next three pairs chosen must be (1, 1), (#, #), (1, 1). The resulting partial solution becomes (#q101#1, #q101#1q21#1). The remainder is now q21#1. Continuing the argument, we see that the only partial solution, the length of whose second string is 14, is (x, x0q1#1), where x = #q101#1q21#1.

Here, we seemingly have a choice, because the next pair used could be (0, 0) or (0q1#, q201#). In the former case we have (x0, x0q1#10) as a partial solution. But this partial solution is a "dead end." No pair can be added to it to make another partial solution; so, surely, it cannot lead to a solution.

In a similar manner, we continue to be forced by our desire to reach a solution to choose one particular pair to continue each partial solution. Finally, we reach the partial solution (y, y1#q310), where

y = #q101#1q21#10q1#1q20.

Since q3 is a final state, we can now use pairs in Groups I, III, and IV to find a solution to the instance of MPCP. The choice of pairs is

(1, 1), (#, #), (q31, q3), (0, 0), (1, 1), (#, #), (q30, q3), (1, 1), (#, #), (q31, q3), (#, #), (q3##, #).

Thus, the shortest word that can be composed of corresponding strings from lists A and B, starting with pair 1, is

#q101#1q21#10q1#1q201#q3101#q301#q31#q3##.

† Since B is never printed, we can omit pairs where B is to the right of the state. Group III pairs also omit those with B on one or both sides of the state.
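The pair lists above were produced by hand from Groups I–IV; the same bookkeeping can be done mechanically. A sketch (states are strings such as "q1"; the B-omission of the footnote is not applied, so some extra B-pairs also appear):

```python
def mpcp_pairs(F, Gamma, delta, q0, w):
    """Lists A and B of Theorem 8.8, as (A-string, B-string) pairs.
    'B' is the blank and '#' the ID separator; the construction
    assumes no moves from final states."""
    pairs = [(X, X) for X in Gamma] + [("#", "#")]            # Group I
    for (q, X), (p, Y, d) in delta.items():                   # Group II
        if q in F:
            continue
        if d == "R":
            pairs.append((q + X, Y + p))
            if X == "B":
                pairs.append((q + "#", Y + p + "#"))
        else:
            for Z in Gamma:
                pairs.append((Z + q + X, p + Z + Y))
                if X == "B":
                    pairs.append((Z + q + "#", p + Z + Y + "#"))
    for q in F:                                               # Group III
        pairs += [(X + q + Y, q) for X in Gamma for Y in Gamma]
        pairs += [(X + q, q) for X in Gamma]
        pairs += [(q + Y, q) for Y in Gamma]
        pairs.append((q + "##", "#"))                         # Group IV
    return ("#", "#" + q0 + w + "#"), pairs

delta = {("q1", "0"): ("q2", "1", "R"), ("q1", "1"): ("q2", "0", "L"),
         ("q1", "B"): ("q2", "1", "L"), ("q2", "0"): ("q3", "0", "L"),
         ("q2", "1"): ("q1", "0", "R"), ("q2", "B"): ("q2", "0", "R")}
first, pairs = mpcp_pairs({"q3"}, ("0", "1", "B"), delta, "q1", "01")
```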
An
application of
PCP
problem can be used to show that a wide variety of probgive only one application here: the undecidability of context-free grammars. The reader should consult the exercises at
Post's correspondence
lems are undecidable.
ambiguity for
We
the end of the chapter for additional applications.
Theorem Proof
8.9
It is
undecidable whether an arbitrary
CFG
ambiguous.
is
Proof Let A = w_1, w_2, ..., w_n and B = x_1, x_2, ..., x_n be two lists of words over a finite alphabet Σ. Let a_1, a_2, ..., a_n be new symbols. Let

    L_A = {w_{i1} w_{i2} ··· w_{im} a_{im} a_{im-1} ··· a_{i1} | m ≥ 1}

and

    L_B = {x_{i1} x_{i2} ··· x_{im} a_{im} a_{im-1} ··· a_{i1} | m ≥ 1}.

Let G be the CFG ({S, S_A, S_B}, Σ ∪ {a_1, ..., a_n}, P, S), where P contains the productions S → S_A, S → S_B, and for 1 ≤ i ≤ n, S_A → w_i S_A a_i, S_A → w_i a_i, S_B → x_i S_B a_i, and S_B → x_i a_i. The grammar G generates the language L_A ∪ L_B.
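The construction of G is mechanical enough to write down. The sketch below is ours, not the text's; the spellings "S_A", "S_B", and "a1", "a2", ... for the new symbols are illustrative:

```python
def ambiguity_grammar(A, B):
    """Productions of G: S -> S_A | S_B and, for each i,
    S_A -> w_i S_A a_i | w_i a_i,  S_B -> x_i S_B a_i | x_i a_i.
    The new index symbols a_i are spelled "a1", "a2", ..."""
    prods = [("S", ["S_A"]), ("S", ["S_B"])]
    for i in range(1, len(A) + 1):
        w, x, a = A[i - 1], B[i - 1], f"a{i}"
        prods += [("S_A", [w, "S_A", a]), ("S_A", [w, a]),
                  ("S_B", [x, "S_B", a]), ("S_B", [x, a])]
    return prods

def tagged_word(lst, sol):
    """The word w_{i1}...w_{im} a_{im}...a_{i1} of L_A (or L_B)."""
    return ("".join(lst[i - 1] for i in sol)
            + "".join(f"a{i}" for i in reversed(sol)))

# A PCP solution makes the same word derivable from both S_A and S_B,
# which is exactly why G is then ambiguous.
A, B, sol = ["1", "10111", "10"], ["111", "10", "0"], [2, 1, 1, 3]
assert tagged_word(A, sol) == tagged_word(B, sol)
```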
If the instance (A, B) of PCP has a solution, say i_1, i_2, ..., i_m, then there is a word w_{i1} w_{i2} ··· w_{im} a_{im} a_{im-1} ··· a_{i1} in L_A that equals the word x_{i1} x_{i2} ··· x_{im} a_{im} a_{im-1} ··· a_{i1} in L_B. This word has a leftmost derivation beginning S → S_A and another beginning S → S_B. Hence in this case G is ambiguous.
Conversely, suppose G is ambiguous. Since the a's dictate the productions used, it is easy to show that any word derived from S_A has only one leftmost derivation from S_A. Similarly, no word derived from S_B has more than one leftmost derivation from S_B. Thus it must be that some word has leftmost derivations from both S_A and S_B. If this word is y a_{im} a_{im-1} ··· a_{i1}, where y is in Σ*, then i_1, i_2, ..., i_m is a solution to PCP.

Thus G is ambiguous if and only if the instance (A, B) of PCP has a solution. We have thus reduced PCP to the ambiguity problem for CFG's. That is, if there were an algorithm for the latter problem, we could construct an algorithm for PCP, which by Theorem 8.8 does not exist. Thus the ambiguity problem for CFG's is undecidable.
8.6 VALID AND INVALID COMPUTATIONS OF TM'S: A TOOL FOR PROVING CFL PROBLEMS UNDECIDABLE
While PCP can be reduced easily to most of the known undecidable problems about CFL's, there is a more direct method that is instructive. We shall in this section show direct reductions of the membership problem for TM's to various problems about CFL's. To do so we need to introduce the notions of valid and invalid Turing machine computations.

A valid computation of a Turing machine M = (Q, Σ, Γ, δ, q_0, B, F), for the purposes of this section, is a string w_1 # w_2^R # w_3 # w_4^R # ··· such that:

1) each w_i is an ID of M, a string in Γ*QΓ* not ending with B,
2) w_1 is an initial ID, one of the form q_0 x for x in Σ*,
3) w_n is a final ID, that is, one in Γ*FΓ*, and
4) w_i ⊢ w_{i+1} for 1 ≤ i < n.

We assume without loss of generality that Q and Γ are disjoint, and that # is in neither Q nor Γ.

The set of invalid computations of a Turing machine M is the complement of the set of valid computations with respect to the alphabet Γ ∪ Q ∪ {#}.

The notions of valid and invalid computations are useful in proving many properties of CFL's to be undecidable. The reason is that the set of invalid computations is a CFL, and the set of valid computations is the intersection of two CFL's.
Lemma 8.6 The set of valid computations of a Turing machine M is the intersection of two CFL's, L_1 and L_2, and grammars for these CFL's can be effectively constructed from M.

Proof Let M = (Q, Σ, Γ, δ, q_0, B, F) be a TM. Both CFL's L_1 and L_2 will consist of strings of the form x_1 # x_2 # ··· # x_m #. We use L_1 to enforce the condition that x_i ⊢ (x_{i+1})^R for odd i, and L_2 to enforce the condition x_i^R ⊢ x_{i+1} for even i. L_2 also enforces the condition that x_1 is an initial ID. That x_m is a final ID or its
reverse is enforced by L_1 or L_2, depending on whether m is odd or even, respectively. Then L_1 ∩ L_2 is the set of valid computations of M.

To begin, let L_3 be {y # z^R | y ⊢ z}. It is easy to construct a PDA P to accept L_3. P reads the input up to the #, checking in its finite control that y is of the form Γ*QΓ*. In the process, P places on its stack the ID z such that y ⊢ z, where y is the input before the #. That is, when the input to P is a symbol of Γ, P pushes that symbol onto the stack. If the input is a state q in Q, P stores q in the finite control and reads the next input symbol, say X (if the next symbol is #, take X to be B). If δ(q, X) = (p, Y, R), then P pushes Yp onto the stack. If δ(q, X) = (p, Y, L), let Z be on top of the stack. Then P replaces Z by pZY (but if the input last read was #, and Y = B, just replace Z by pZ, or by p if Z is also B). After reading the #, P compares each input symbol with the top stack symbol. If they differ, P has no next move and so dies. If they are equal, P pops the top stack symbol. When the stack is emptied, P accepts.

Now let L_1 = (L_3 #)*({ε} ∪ Γ*FΓ*#). By Theorems 5.4 and 6.1, there is an algorithm to construct a CFG G_1 for L_1. In a similar way, we can construct a PDA for L_4 = {y^R # z | y ⊢ z}. The construction of a CFG G_2 for
L_2 = q_0 Σ* # (L_4 #)*({ε} ∪ Γ*FΓ*#) is then easy, and by Theorem 6.1 there is an algorithm to construct G_2. Now L_1 ∩ L_2 is the set of valid computations of M. That is, if x_1 # x_2 # ··· # x_m # is in L_1 ∩ L_2, then L_1 requires that x_i ⊢ (x_{i+1})^R for odd i, and L_2 requires that x_i^R ⊢ x_{i+1} for even i. That x_1 is initial is enforced by L_2, and that the last ID has an accepting state is
enforced by L_1 for m odd and by L_2 for m even.

Theorem 8.10 It is undecidable for arbitrary CFG's G_1 and G_2 whether L(G_1) ∩ L(G_2) is empty.

Proof By Lemma 8.6 we can construct from M grammars G_1 and G_2 such that L(G_1) ∩ L(G_2) is the set of valid computations of M. If there were an algorithm A to tell whether the intersection of the languages of two CFG's is empty, we could construct an algorithm B to tell whether L(M) = ∅ for arbitrary TM M. Simply design B to construct G_1 and G_2 from M as in Lemma 8.6, then apply Algorithm A to tell whether L(G_1) ∩ L(G_2) is empty. If the intersection is empty, then there are no valid computations of M, so L(M) = ∅. If the intersection is not empty, L(M) ≠ ∅. That is, the problem of emptiness for r.e. sets reduces to the problem of intersection for CFG's. Algorithm B cannot exist, however, since L(M) = ∅ is undecidable by Theorem 8.6. Therefore A does not exist, so it is undecidable whether the intersection of two CFL's is empty.
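The heart of the constructions above is the one-move relation y ⊢ z on ID's. As an illustration (our own sketch, with a hypothetical transition table, not the book's example machine), the successor ID can be computed directly, mirroring the case analysis the PDA P performs:

```python
def step_id(id_str, delta, states, blank="B"):
    """Given an ID y in Gamma*QGamma*, return z with y |- z, or None if
    the machine halts.  delta maps (q, X) to (p, Y, 'L' or 'R')."""
    for k, c in enumerate(id_str):           # locate the state symbol
        if c in states:
            q, left, right = c, id_str[:k], id_str[k + 1:]
            break
    X = right[0] if right else blank         # scanned symbol (B past the end)
    if (q, X) not in delta:
        return None                          # no move: y is a halting ID
    p, Y, move = delta[(q, X)]
    rest = right[1:]
    if move == "R":
        z = left + Y + p + rest
    else:
        if not left:
            return None                      # would fall off the left end
        z = left[:-1] + p + left[-1] + Y + rest
    return z.rstrip(blank)                   # ID's never end with B

# hypothetical one-tape machine used only for illustration
delta = {("q", "0"): ("p", "1", "R"), ("q", "1"): ("p", "0", "L"),
         ("q", "B"): ("p", "1", "L")}
assert step_id("q01", delta, {"q", "p"}) == "1p1"
```

A string y#z^R is then in L_3 exactly when step_id(y, ...) equals z.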
Although two context-free languages are required to represent the valid computations of a Turing machine, the set of invalid computations is itself a CFL. The reason is that we no longer need to guarantee simultaneously for each i that w_i ⊢ w_{i+1}; we need only guess where an error occurs. That is, we must verify for just one i that w_i ⊢ w_{i+1} is false.

Lemma 8.7 The set of invalid computations of a Turing machine M = (Q, Σ, Γ, δ, q_0, B, F) is a CFL.

Proof If a string w is an invalid computation, then one of the following conditions holds.
1) w is not of the form x_1 # x_2 # ··· # x_m #, where each x_i is an ID of M;
2) x_1 is not initial; that is, x_1 is not in q_0 Σ*;
3) x_m is not final; that is, x_m is not in Γ*FΓ*;
4) x_i ⊢ (x_{i+1})^R is false for some odd i;
5) x_i^R ⊢ x_{i+1} is false for some even i.
The set of strings satisfying (1), (2), and (3) is regular, and an FA accepting it is easily constructed. The sets of strings satisfying (4) and (5) are each CFL's. We prove this contention for (4); a similar argument prevails for (5). A PDA P for (4) nondeterministically selects some x_i that is preceded by an even number of #'s and, while reading x_i, stores on its stack the ID z such that x_i ⊢ z, with the right end of z at the top of the stack. After finding # on the input, P compares z^R with the following x_{i+1}. If z^R ≠ x_{i+1}, then P scans its remaining input and accepts.

The set of invalid computations is the union of two CFL's and a regular set. By Theorem 6.1 it is a CFL, and a grammar for this language can be constructed effectively.
Theorem 8.11 It is undecidable for an arbitrary CFG G whether L(G) = Σ*.

Proof Given an arbitrary TM M, we can effectively construct a CFG G with terminal alphabet Σ, such that L(G) = Σ* if and only if L(M) = ∅. That is, by Lemma 8.7 we may construct a CFG G that generates the invalid computations of M. Thus if "L(G) = Σ*" were decidable for arbitrary G, then we could decide for arbitrary M whether L(M) = ∅, a contradiction.
Other consequences of the characterization of computations by CFL's

Many other results follow from Theorem 8.11.

Theorem 8.12 Let G_1 and G_2 be arbitrary CFG's and R an arbitrary regular set. The following problems are undecidable.

1) L(G_1) = L(G_2);
2) L(G_2) ⊆ L(G_1);
3) L(G_1) = R;
4) R ⊆ L(G_1).

Proof Fix G_2 to be a grammar generating Σ*, where Σ is the terminal alphabet of G_1. Then (1) and (2) are equivalent to L(G_1) = Σ*. Fix R = Σ*, and (3) and (4) are equivalent to L(G_1) = Σ*. Thus the undecidable problem of whether a CFL is Σ* reduces to each of (1) through (4), and each of these problems is undecidable as well.
Note that by Theorems 5.3 and 5.4, one can convert effectively between PDA's and CFG's, so Theorems 8.10, 8.11, and 8.12 remain true if CFL's are represented by PDA's instead of CFG's. Also, the regular set R in Theorem 8.12 can be represented by a DFA, NFA, or regular expression, as we choose.

One should observe also that the question L(G) ⊆ R is decidable. The reason is that L(G) ⊆ R if and only if the intersection of L(G) with the complement of R is empty. But that intersection is a CFL, and hence its emptiness is decidable.
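The emptiness test appealed to here is the standard marking algorithm: repeatedly mark nonterminals that derive some terminal string. A minimal sketch (ours; the grammar encoding is hypothetical):

```python
def cfg_is_empty(prods, start):
    """L(G) is empty iff the start symbol is not 'generating'.
    prods: list of (head, body) pairs, body a list of symbols;
    the nonterminals are exactly the heads appearing in prods."""
    nonterms = {h for h, _ in prods}
    generating = set()
    changed = True
    while changed:
        changed = False
        for h, body in prods:
            # h generates a terminal string if every body symbol does
            if h not in generating and all(
                    s in generating or s not in nonterms for s in body):
                generating.add(h)
                changed = True
    return start not in generating

assert not cfg_is_empty([("S", ["A", "b"]), ("A", ["a"])], "S")
assert cfg_is_empty([("S", ["S", "a"])], "S")
```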
There are some additional properties of context-free languages that we can show to be undecidable by observing that if a TM has valid computations on an infinite set of inputs, its set of valid computations is not, in general, a CFL. However, we first modify each Turing machine M in a trivial way by adding two extra states whose sole purpose is to ensure that M makes at least two moves in every computation. This can be done without otherwise modifying the computation performed by M. The purpose of the modification is to force each valid computation to contain at least three ID's and thus ensure that the set of valid computations is a CFL if and only if M accepts a finite set.

Lemma 8.8 Let M be a Turing machine that makes at least three moves on every input. The set of valid computations of M is a CFL if and only if the set accepted by M is a finite set.
M
is If the set accepted by is finite, the set of valid computations of is infinite and the set L of and hence a CFL. Assume the set accepted by valid computations is a CFL. Since accepts an infinite set, there exists a valid computation
Proof
M
finite
M
where the
Mark
w,'s are ID's,
and |
the symbols of
ing both W! and
w3
w2
w2
|
is
greater than the constant n in Ogden's lemma.
as distinguished.
Then we can "pump" w 2 without pump-
thus getting an invalid computation that must be in
,
L We
conclude that the valid computations do not form a CFL.
Theorem 8.13 It is undecidable for arbitrary CFG's G_1 and G_2 whether

1) the complement of L(G_1) is a CFL;
2) L(G_1) ∩ L(G_2) is a CFL.

Proof
1) Given an arbitrary Turing machine M, modify M without changing the set accepted, so that M makes at least two moves on every input. Construct a CFG G_1 generating the invalid computations. The complement of L(G_1) is the set of valid computations, which is a CFL if and only if M accepts a finite set.
2) Proceed as in (1), but construct CFG's G_1 and G_2 such that L(G_1) ∩ L(G_2) is the set of valid computations of M.
8.7 GREIBACH'S THEOREM

There is a striking similarity among the proofs of undecidability in language theory. This suggests that there is an analog of Rice's theorem for classes of languages such as the CFL's, and indeed there is. Let us focus our attention on a class of languages such as the CFL's, and on a particular system (such as CFG's or PDA's) for interpreting finite-length strings as names of languages. Consider a class 𝒞 of languages with the property that, given names (e.g., grammars) of languages L_1 and L_2 in 𝒞 and a name (e.g., a finite automaton) for a regular set R, we can effectively construct names for RL_1, L_1 R, and L_1 ∪ L_2. Then we say that the class is effectively closed under concatenation with regular sets and union. Assume furthermore that "L = Σ*" is undecidable for the class, as is the case for the CFL's. The next theorem shows that a wide variety of problems are undecidable for the class 𝒞.
Theorem 8.14 (Greibach's Theorem) Let 𝒞 be a class of languages that is effectively closed under concatenation with regular sets and union, and for which "= Σ*" is undecidable for any sufficiently large fixed Σ. Let P be any nontrivial property† that is true for all regular sets and that is preserved under /a, where a is a single symbol. (That is, if L has the property P, so does L/a = {w | wa is in L}.) Then P is undecidable for 𝒞.
Proof Let L_0 ⊆ Σ* be a member of 𝒞 for which P(L_0) is false, where Σ is sufficiently large so that "= Σ*" is undecidable. For any L ⊆ Σ* in 𝒞, construct

    L_1 = L_0 # Σ* ∪ Σ* # L.

L_1 is in 𝒞, since 𝒞 is effectively closed under concatenation with regular sets and under union. Now if L = Σ*, then L_1 = Σ* # Σ*, which is a regular set, and hence P(L_1) is true. If L ≠ Σ*, then there exists some w not in L. Hence L_1/#w = L_0. Since P is preserved under quotient with a single symbol, it is preserved under quotient with the string #w, by induction on |w|. Thus P(L_1) must be false, or else P(L_0) would be true, contrary to our assumption. Therefore P(L_1) is true if and only if L = Σ*. Thus "= Σ*" for 𝒞 reduces to property P for 𝒞, and hence P is undecidable for 𝒞.
Applications of Greibach's theorem

Theorem 8.14 can be used to show, for example, that it is undecidable if the language generated by a CFG is regular. Note that this question is different from asking if the language generated is equal to some particular regular set R, as was asked in Theorem 8.12.

Theorem 8.15 Let G be an arbitrary CFG. It is undecidable whether L(G) is regular.

Proof The CFL's are effectively closed under concatenation with regular sets and under union. Let P be the property that L is regular. P is nontrivial for the CFL's, is true for all the regular sets, and is preserved under quotient with a single symbol by Theorem 3.6. Note that the regular sets are effectively closed under quotient with another regular set, although Theorem 3.6 does not claim this (see the discussion following that theorem). Thus by Theorem 8.14, P is undecidable for CFL's.

† Technically, a property is just a subset of 𝒞. We say "L has property P" or "P(L)" to mean L is a member of P.
Theorem 8.14 allows us to show that a property is undecidable by showing that the property is preserved under quotient with a single symbol. This latter task is often relatively easy as, for example, in proving that inherent ambiguity is undecidable.
Lemma 8.9 Let P be the property that a CFL is not inherently ambiguous. Then P is preserved under quotient with a single symbol.

Proof Let G = (V, T, P, S) be an unambiguous CFG. Let

    G_a = (V ∪ {[A/a] | A is in V}, T, P_a, [S/a]),

where P_a contains

1) all productions of P,
2) [A/a] → α if A → αa is in P, and
3) [A/a] → α[B/a] if A → αBβ is in P and β ⇒* ε.

We claim that L(G_a) = L(G)/a and that G_a is unambiguous. To see this, first show by an easy induction that

1) [S/a] ⇒* α if and only if S ⇒* αa, and
2) [S/a] ⇒* α[A/a] if and only if S ⇒* αA.

That L(G_a) = L(G)/a follows immediately. Assume G_a is ambiguous. Then there must be two leftmost derivations

1) [S/a] ⇒ β ⇒* x, and
2) [S/a] ⇒ γ ⇒* x,

where β ≠ γ. But then in G we have two leftmost derivations of the string xa, a contradiction. Thus G_a must be unambiguous. We conclude that unambiguity is preserved under quotient with a single symbol.
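The construction of G_a can be sketched in code (ours, not the text's; the pair ("quot", A) encodes the new nonterminal [A/a], and `nullable` must be supplied as the precomputed set of nonterminals deriving ε):

```python
def quotient_grammar(prods, start, a, nullable):
    """Productions of G_a for L(G)/a: keep P; add [A/a] -> alpha for each
    A -> alpha a in P; add [A/a] -> alpha [B/a] for each A -> alpha B beta
    in P with beta derivable to the empty string."""
    def q(A):
        return ("quot", A)
    nts = {h for h, _ in prods}
    new = list(prods)
    for A, body in prods:
        if body and body[-1] == a:                     # rule 2
            new.append((q(A), body[:-1]))
        for k, B in enumerate(body):                   # rule 3
            if B in nts and all(s in nullable for s in body[k + 1:]):
                new.append((q(A), body[:k] + [q(B)]))
    return new, q(start)

# toy grammar: S -> A a, A -> a, so L(G) = {aa} and L(G)/a = {a}
prods = [("S", ["A", "a"]), ("A", ["a"])]
ga, qs = quotient_grammar(prods, "S", "a", nullable=set())
assert (("quot", "S"), ["A"]) in ga and (("quot", "A"), []) in ga
```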
Theorem 8.16 Inherent ambiguity for CFL's is undecidable.

Proof By Theorem 4.7, P is nontrivial. By Lemma 8.9 it is preserved under quotient with a single symbol. It is easy to show that P is true for all regular sets; that is, every regular set has an unambiguous CFG. (The reader may look ahead to Theorem 9.2 for a construction of an unambiguous CFG from an arbitrary DFA.) Thus by Theorem 8.14, inherent ambiguity for CFL's is undecidable.
8.8 INTRODUCTION TO RECURSIVE FUNCTION THEORY

We mentioned in Section 7.3 that each Turing machine can be thought of as computing a function from integers to integers, as well as being a language recognizer. For every Turing machine M and every k, there is a function f_M^(k)(i_1, i_2, ..., i_k) that takes k integers as arguments and produces an integer answer or is undefined for those arguments. If M started with 0^{i_1} 1 0^{i_2} 1 ··· 1 0^{i_k} on its tape halts with 0^j on its tape, then we say f_M^(k)(i_1, ..., i_k) = j. If M does not halt with a tape consisting of a block of 0's with all other cells blank, then f_M^(k)(i_1, ..., i_k) is undefined. Note that the same Turing machine can be thought of as a language recognizer, a computer of a function with one argument, a computer of a different function of two arguments, and so on. If i is an integer code for a TM M, as described in Section 8.3, and k is understood, then we shall often write f_i in place of f_M^(k).

Recall that a function computed by a Turing machine is called a (partial) recursive function. If it happens to be defined for all values of its arguments, then it is also called a total recursive function.

The constructions on Turing machines given earlier in this chapter and the previous one can be expressed as total recursive functions of a single variable. That is, an algorithm A that takes as input the binary code for a TM M and produces as output the binary code for another TM M' can be viewed as a function g of one variable. In particular, let i be the integer representing M and j be the integer representing M'. Then g(i) = j. Technically, the TM B that computes g is not A, but rather one that converts its unary input to binary, simulates A, and then converts its output to unary.
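For concreteness, the tape-input convention 0^{i_1} 1 0^{i_2} 1 ··· 1 0^{i_k} can be written down directly (a trivial helper of ours, not from the text):

```python
def tape_input(args):
    """Encode integer arguments as 0^{i1} 1 0^{i2} 1 ... 1 0^{ik}."""
    return "1".join("0" * i for i in args)

assert tape_input([3, 2]) == "000100"
assert tape_input([0, 1]) == "10"   # a zero argument contributes no 0's
```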
The S_mn-theorem

Our first theorem, called the S_mn-theorem, says that given a partial recursive function g(x, y) of two variables, there is an algorithm one can use to construct, from a TM for g and a value for x, another TM which with input y computes g(x, y).

Theorem 8.17 Let g(x, y) be a partial recursive function. Then there is a total recursive function σ of one variable, such that f_{σ(x)}(y) = g(x, y) for all x and y. That is, if σ(x) is treated as the integer representing some TM M_x, then f_{M_x}(y) = g(x, y).
Proof Let M compute g. Let A be a TM that, given input x written in unary, constructs a TM M_x that behaves as follows: when given input y, M_x shifts it right and writes 0^x 1 to its left; M_x then returns its head to the left end and simulates M. The output of A is the unary representation of an integer code for M_x; this code is σ(x). Clearly M_x produces g(x, y) when given input y. Since f_{σ(x)} is the function computed by M_x, the equality f_{σ(x)}(y) = g(x, y) follows.
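In programming terms the S_mn-theorem is partial application: from a program for g and a fixed first argument x, one effectively obtains a program for y ↦ g(x, y). A Python analogy (ours; closures stand in for TM codes):

```python
def s11(g):
    """Return sigma: for each x, sigma(x) is a 'program' computing
    y |-> g(x, y).  Here programs are closures rather than TM indices."""
    def sigma(x):
        def specialized(y):
            return g(x, y)          # first argument frozen at x
        return specialized
    return sigma

g = lambda x, y: 10 * x + y
sigma = s11(g)
assert sigma(3)(4) == g(3, 4) == 34
```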
The recursion theorem

The second theorem, called the recursion theorem, states that every total recursive function σ mapping indices of partial recursive functions (integers denoting Turing machines) into indices of partial recursive functions has a fixed point x_0 such that f_{x_0}(y) = f_{σ(x_0)}(y) for all y. In other words, if we modify all Turing machines in some manner, there is always some Turing machine M_{x_0} for which the modified Turing machine M_{σ(x_0)} computes the same function as the unmodified Turing machine. At first this sounds impossible, since we can modify each Turing machine to add 1 to the originally computed function. One is tempted to say that f(y) + 1 ≠ f(y). But note that if f(y) is everywhere undefined, then f(y) + 1 does equal f(y) for all y.

Theorem 8.18 For any total recursive function σ there exists an x_0 such that f_{x_0}(x) = f_{σ(x_0)}(x) for all x.

Proof For each integer i construct a TM that on input x computes f_i(i) and then simulates, by means of a universal TM, the f_i(i)-th TM on x. Let g(i) be the index of the TM so constructed. Thus for all i and x,

    f_{g(i)}(x) = f_{f_i(i)}(x).                                (8.3)

Observe that g(i) is a total function even if f_i(i) is not defined. Let j be an index of the function σg; that is, the j-th TM, given input i, computes g(i) as an integer code and then applies σ to g(i). Then for x_0 = g(j) we have

    f_{x_0}(x) = f_{g(j)}(x) = f_{f_j(j)}(x)        by (8.3)
               = f_{σ(g(j))}(x) = f_{σ(x_0)}(x),

since f_j(j) = σ(g(j)). Thus x_0 is a fixed point of the mapping σ; that is, TM x_0 and TM σ(x_0) compute the same function.

Applications of the recursion and S_mn-theorems

Example 8.9 Let M_1, M_2, ... be any enumeration of all Turing machines. We do not require that this enumeration be the "standard" one introduced in Section 8.3, but only that whatever representation is used for a TM, we can by an algorithm convert from that representation to the 7-tuple notation introduced in Section 7.2, and vice versa. Then we can use the recursion theorem to show that for some i, M_i and M_{i+1} compute the same function.

Let σ(i) be the total recursive function defined as follows. Enumerate M_1, M_2, ... until one with integer code i, as in (8.2), is found. Note that the states of a TM must be considered in all possible orders to see if i is a code for this TM, since in the notation introduced in Section 8.3, the order in which the moves for the various states is written affects the code. Having found the M_j with code i, enumerate one more TM, M_{j+1}, and let σ(i) be the code for M_{j+1}. Then the recursion theorem applied to this σ says there is some x_0 with f_{x_0} = f_{σ(x_0)}; that is, the machine with code x_0, say M_j, and the next machine M_{j+1} define the same function of one variable.

Example 8.10 Given a formal system F, such as set theory, we can exhibit a Turing machine M such that there is no proof in F that M started on any particular input halts, and no proof that it does not halt.

Construct a TM M computing a two-input function g(i, j) such that

    g(i, j) = 1 if there is a proof in F that the i-th TM does not halt when given input j,

and g(i, j) is undefined otherwise. M enumerates proofs in F in some order, printing 1 if a proof that the i-th TM does not halt on input j is found. Further, we may construct M so that if g(i, j) = 1, then M halts, and M does not halt otherwise. By the S_mn-theorem there exists a σ such that f_{σ(i)}(j) = g(i, j). By the recursion theorem, we may effectively construct an integer i_0 such that

    f_{i_0}(j) = f_{σ(i_0)}(j) = g(i_0, j).

But g(i_0, j) = 1, and is therefore defined, if and only if there is a proof in F that f_{i_0}(j) is undefined. Thus if F is consistent (i.e., there cannot be proofs of a statement and its negation), there can be no proof in F that the i_0-th TM either halts or does not halt on any particular input j.
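The diagonal construction in the proof of Theorem 8.18 can be replayed concretely if we model TM codes by Python source strings and the universal machine by eval (our sketch, not the text's; the particular sigma below is an arbitrary total transformation, chosen so the run terminates):

```python
def g(i):
    """The proof's g: a program that on input x first computes f_i(i)
    (itself a program text) and then runs that program on x."""
    return f"lambda x: eval(eval({i!r})({i!r}))(x)"

def sigma(p):
    """An arbitrary total transformation of programs (it ignores p)."""
    return "lambda x: 2 * x"

j = "lambda i: sigma(g(i))"   # a program for the composition sigma . g
x0 = g(j)                     # the fixed point of the proof

# f_{x0} and f_{sigma(x0)} agree, as the recursion theorem promises
assert eval(x0)(5) == eval(sigma(x0))(5) == 10
```

Running the program x0 evaluates j on its own text, obtaining sigma(g(j)) = sigma(x0), and then runs that; this is exactly the chain f_{x0} = f_{g(j)} = f_{f_j(j)} = f_{sigma(x0)} from the proof.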
8.9 ORACLE COMPUTATIONS

One is tempted to ask what would happen if the emptiness problem, or some other undecidable problem, were decidable. Could we then compute everything? To answer the question we must be careful. If we start out by assuming that the emptiness problem is decidable, we have a contradictory set of assumptions and may conclude anything. We avoid this problem by defining a Turing machine with oracle.

Let A be a language, A ⊆ Σ*. A Turing machine with oracle A is a single-tape Turing machine with three special states q_?, q_y, and q_n. The state q_? is used to ask whether a string is in the set A. When the Turing machine enters state q_?, it requests an answer to the question: "Is the string of nonblank symbols to the right of the tape head in A?" The answer is supplied by having the state of the Turing machine change on the next move to one of the two states q_y or q_n, depending on whether the answer is yes or no.† The computation continues normally until the next time q_? is entered, when the "oracle" answers another question.

Observe that if A is a recursive set, then the oracle can be simulated by another Turing machine, and the set accepted by the TM with oracle A is recursively enumerable. On the other hand, if A is not a recursive set and an oracle is available to supply the correct answer, then the TM with oracle A may accept a set that is not recursively enumerable. We denote the Turing machine M with oracle A by M^A. A set L is recursively enumerable with respect to A if L = L(M^A) for some TM M. A set L is recursive with respect to A if L = L(M^A) for some TM M^A that always halts. Two oracle sets are equivalent if each is recursive in the other.
A hierarchy of undecidable problems

We can now rephrase the question at the beginning of the section as "What can be recognized given an oracle for the emptiness problem?" Clearly not all sets can be r.e. with respect to the emptiness problem, since there is an uncountable number of sets and only a countable number of TM's. Consider the oracle set S_1 = {⟨M⟩ | L(M) = ∅}, which is not an r.e. set (recall that ⟨M⟩ is the binary code for TM M). Now consider TM's with oracle S_1. These machines have a halting problem that is not recursive in S_1. By defining an oracle for the emptiness problem for TM's with oracle S_1, and so on, we can develop an infinite hierarchy of undecidable problems. More specifically, define

    S_{i+1} = {⟨M⟩ | L(M^{S_i}) = ∅};

that is, S_{i+1} is an oracle for solving the emptiness problem for TM computations with respect to S_i. We can now classify some undecidable problems (but not all such problems) by showing their equivalence to a set S_i for some particular i.

Theorem 8.19 The membership problem for TM's without oracles is equivalent to S_1.

Proof Construct a TM M_1^{S_1} that, given (M, w) on its input, constructs the code for a TM M' that accepts ∅ if w is not in L(M) and accepts (0 + 1)* otherwise. The construction of M' was given in Example 8.2. M_1^{S_1} then enters state q_? with the code for M' to the right of its head, and accepts if and only if q_n is entered. Thus the membership problem for TM's without oracles is recursive in S_1.

Conversely, we can show there is a Turing machine with the membership problem as oracle that recognizes S_1. (Strictly speaking, the oracle is L_u.) To show S_1 is recursive in L_u, construct a TM M_2 that, given ⟨M⟩, constructs a new TM M' operating as follows. M' ignores its own input; instead, M' uses the pair generator to generate all pairs (i, j). When (i, j) is generated, M' simulates M for i steps on the j-th input word, words being numbered in the usual ordering. If M accepts, M' accepts its own input. Thus if L(M) ≠ ∅, M' accepts all its own inputs, and M' accepts ε in particular. If L(M) = ∅, then L(M') = ∅. M_2 may query its oracle whether (M', ε) is in L_u. If so, M_2 rejects ⟨M⟩; otherwise M_2 accepts. Thus S_1 is recursive in L_u.

† Note that the TM can remember its prior state by writing that state on its tape just before entering q_?.

Next consider the problem whether L(M) = Σ*, where Σ is the input alphabet for TM M. In a sense, this problem is "harder" than membership or emptiness because, as we shall see, the "= Σ*" problem is equivalent to S_2, while emptiness and membership are equivalent to S_1. While this difference means nothing in practical terms, since all these problems are undecidable, the results on comparative degree of difficulty suggest that when we consider restricted versions of the problems, the "= Σ*" problem really is harder than membership or emptiness. For context-free grammars, the emptiness and membership problems are decidable, while by Theorem 8.11 the problem whether L(G) = Σ* is undecidable. For another example, consider regular expressions. The emptiness and membership problems are each decidable efficiently, in time polynomial in the length of the expression, while the problem whether a given regular expression r is equivalent to Σ* has been proved almost certainly to require time exponential in the length of r.‡

Theorem 8.20 The problem whether L(M) = Σ* is equivalent to S_2.

Proof We construct a TM M_2^{S_2} that takes an arbitrary TM M and constructs from it a TM M_1^{S_1}§ with oracle S_1 that behaves as follows. M_1^{S_1} enumerates words x, and for each x uses oracle S_1 to tell whether M accepts x. The technique whereby S_1 can be used to answer the membership question was covered in Theorem 8.19. M_1^{S_1} accepts its own input if any x is not accepted by M. Thus L(M_1^{S_1}) = ∅ if and only if L(M) = Σ*. M_2^{S_2} with input ⟨M⟩ constructs M_1^{S_1}, then asks its own oracle whether L(M_1^{S_1}) = ∅. If so, M_2^{S_2} accepts ⟨M⟩, and rejects otherwise. Thus the "= Σ*" problem is recursive in S_2.

Now we must show that S_2 is recursive in the "= Σ*" problem. That is, let T = {⟨M⟩ | L(M) = Σ*} be the set of codes for ordinary Turing machines that accept all strings over their input alphabet. Then there is a TM M_3^T that accepts S_2.

Before constructing M_3^T, we first define a valid computation of a TM using oracle S_1. A valid computation is a sequence of ID's, just as for ordinary Turing machines. However, if one ID has state q_? and the next ID has state q_n, then M has queried the oracle whether some TM N accepts ∅ and received the answer "no."

‡ Note that this problem is "complete in polynomial space"; see Chapter 13.
§ Technically, the oracle S_1 is not part of M_1. Actually, M_2^{S_2} constructs the state transitions of an oracle machine that will work correctly given S_1 as oracle.
To demonstrate that this answer is correct, we insert a valid computation of ordinary TM N, showing that N accepts some particular input. If the next state is q_y, however, we insert no computation of N.

Now, let us describe how M_3^T behaves on input M^{S_1}. M_3^T creates an ordinary TM M' that accepts all the invalid computations of M^{S_1}. To check that a string is not a valid computation, M' checks if the format is invalid (as in Lemma 8.7), or if one ID of M^{S_1} does not follow on one move from the previous ID of M^{S_1} in the sequence, or if a computation of an ordinary TM N inserted between ID's of M^{S_1} with states q_? and q_n is not valid. The only difficult part to check is when one ID of M^{S_1} has state q_?, and the next ID has state q_y. Then M' must determine if "yes" is not the correct answer, so that these two ID's do not follow in sequence. Let N be the TM about which the query is made. M' uses the pair generator and, when (i, j) is generated, simulates N for i steps on the j-th input. If N accepts, M' determines that L(N) ≠ ∅, so "yes" is the wrong answer. Thus the computation is not a valid one, and M' accepts this computation.

Now M' accepts all strings over its input alphabet if and only if L(M^{S_1}) = ∅, that is, M^{S_1} has no valid computations. M_3^T may query its oracle whether M' accepts Σ*. The code for M^{S_1} is in S_2 if and only if L(M') = Σ*. Thus S_2 is recursive in the "= Σ*" problem.
Turing reducibility

We have, throughout this chapter, dealt with a notion called "reducibility," in which we reduced language L_1 to L_2 by finding an algorithm that mapped strings in L_1 to strings in L_2 and strings not in L_1 to strings not in L_2. This notion of reducibility is often called many-one reducibility, and while it was all we needed, it is not the most general notion. A more general technique is called Turing reducibility, and consists simply of showing that L_1 is recursive in L_2.

If L_1 is many-one reducible to L_2, then surely L_1 is Turing-reducible to L_2. In proof, suppose f is a function computable by a TM that always halts, such that f(x) is in L_2 if and only if x is in L_1. Then consider the oracle TM M^{L_2} that, given input x, computes f(x) and then enters state q_? with f(x) to the right of its head. M^{L_2} accepts if and only if it then enters q_y. Surely L(M^{L_2}) = L_1, so L_1 Turing-reduces to L_2. The converse is false, and a proof is suggested in the exercises.

If L_1 Turing-reduces to L_2, and L_1 is undecidable, then so is L_2. For if L_2 were recursive, then the oracle TM M^{L_2} such that L(M^{L_2}) = L_1 could be simulated by an ordinary TM that always halts, and L_1 would be recursive. Thus one could use a Turing reduction to show that L_2 is undecidable, given that L_1 was undecidable, even in circumstances where a many-one reduction of L_1 to L_2 did not exist, or was hard to find.

The notion of many-one reducibility has its virtues, however. If L_1 is many-one reducible to L_2, and L_1 is not r.e., we can conclude L_2 is not r.e. Yet this
conclusion cannot be drawn for Turing reducibility. For example, the complement of L_u is a non-r.e. language that Turing-reduces to the r.e. language L_u. We can recognize the complement of L_u, given L_u as an oracle, by asking whether (M, w) is in L_u and accepting if and only if the answer is no.

We see that the more difficult form of reducibility (many-one) enables us to draw conclusions we cannot draw with the easier form of reducibility (Turing). In Chapter 13, where we study bounded reducibility, we shall see additional examples of how more difficult forms of reductions yield conclusions not achievable by easier forms.
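Both notions can be made concrete with ordinary computable sets standing in for the oracle. The two toy languages and the reduction below are illustrative assumptions, not from the text; they show a many-one reduction used as a single oracle query whose answer is taken verbatim, versus a Turing reduction that is free to post-process (here, complement) the oracle's answer.

```python
# Sketch: many-one vs. Turing reducibility with a decidable "oracle".
# Hypothetical example languages (not from the text):
#   L2 = strings with an even number of 1's  (the oracle set)
#   L1 = strings with an odd number of 1's

def oracle_L2(x: str) -> bool:          # membership oracle for L2
    return x.count("1") % 2 == 0

# Many-one reduction: a total computable f with  x in L1  iff  f(x) in L2.
def f(x: str) -> str:
    return x + "1"                       # appending a 1 flips the parity

def decide_L1_many_one(x: str) -> bool:
    return oracle_L2(f(x))               # one query, answer used verbatim

# Turing reduction: arbitrary use of the oracle, e.g. negating its answer.
def decide_L1_turing(x: str) -> bool:
    return not oracle_L2(x)

for x in ["", "1", "101", "0110", "111"]:
    assert decide_L1_many_one(x) == decide_L1_turing(x) == (x.count("1") % 2 == 1)
```

The Turing decider here complements the oracle's answer, which is exactly the freedom many-one reducibility lacks, and why non-r.e.-ness transfers only under many-one reductions.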
EXERCISES

8.1 Suppose the tape alphabets of all Turing machines are selected from some infinite set of symbols a₁, a₂, .... Show how each TM may be encoded as a binary string.

8.2 Which of the following properties of r.e. sets are themselves r.e.?
a) L contains at least two strings.
b) L is infinite.
c) L is a context-free language.
d) L = L*.
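One way to attack Exercise 8.1, in the spirit of the binary TM codes used earlier in the chapter (the exact layout below is our assumption): a move δ(q_i, a_j) = (q_k, a_l, D_m) becomes 0^i 1 0^j 1 0^k 1 0^l 1 0^m, and successive moves are separated by 11. Any one TM mentions only finitely many of the symbols a₁, a₂, ..., so only finitely many indices occur in its code.

```python
# Sketch of a binary encoding of a TM whose tape symbols are drawn, by
# index, from the infinite list a_1, a_2, ... (layout is an assumption).

def encode_move(i, j, k, l, m):
    # indices are >= 1; direction m: 1 = left, 2 = right
    return "0" * i + "1" + "0" * j + "1" + "0" * k + "1" + "0" * l + "1" + "0" * m

def encode_tm(moves):
    """moves: list of (i, j, k, l, m) tuples describing delta."""
    return "11".join(encode_move(*mv) for mv in moves)

# A two-move toy machine.  Distinct move lists give distinct strings,
# since each move's five blocks of 0's are delimited by single 1's.
code = encode_tm([(1, 1, 2, 1, 2), (2, 1, 2, 2, 2)])
assert set(code) <= {"0", "1"}
```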
8.3 Show that it is undecidable whether a TM halts on all inputs.

8.4 A Post Tag System is a finite set P of pairs (α, β) chosen from some finite alphabet, and a start string γ. We say αδ ⇒_P δβ if (α, β) is a pair. Define ⇒*_P to be the reflexive, transitive closure of ⇒_P, as for grammars. Show that for given tag system (P, γ) and string δ, it is undecidable whether γ ⇒*_P δ. [Hint: For each TM M let γ be the initial ID of M on blank tape, followed by a marker #, and select the pairs so that any ID must become the next ID after a sequence of applications of the rules, unless that ID has an accepting state, in which case the ID can eventually become ε. Then ask if γ ⇒*_P ε.]

8.5 Show that there is no algorithm which, given a TM M defining a partial recursive function f of one variable, produces a TM M' that defines a different function of one variable.
**8.6 For ordinary Turing machines M, show that
a) the problem of determining whether L(M) is finite is equivalent to S₂;
b) the problem of determining whether L(M) is a regular set is equivalent to S₃.

8.7 Show that the following problems about programs in a real programming language are undecidable.
a) Whether a given program can loop forever on some input.
b) Whether a given program ever produces an output.
c) Whether two programs produce the same output on all inputs.

8.8 Use Theorem 8.14 to show that the following properties of CFL's are undecidable.
a) L is a linear language.
b) The complement of L is a CFL.
*S 8.9 Show that Theorem 8.14 applies to the linear languages. [Hint: Consult Theorem 9.2 for a proof that every regular set has a linear grammar. The hard part is showing that "L = Σ*" is undecidable for linear languages.]

*8.10 Show that the following properties of linear languages are undecidable. You may use the fact that every regular set is a linear language.
a) L is a regular set.
b) The complement of L is a linear language.
c) The complement of L is a CFL.
d) L has no unambiguous linear CFG.

*8.11 Show that for CFL L, it is undecidable whether L = L*.
*8.12
a) Show that if L₁ many-one reduces to L₂, and L₂ is (i) recursive in L₃, or (ii) r.e. in L₃, then L₁ is recursive or r.e. in L₃, respectively.
b) Show that the complement of L_u Turing-reduces to S₁.
c) Show that the complement of L_u does not many-one reduce to S₁. [Hint: Use part (a).]
8.13 We say that L₁ "truth-table" reduces to L₂ if:
1) There are k algorithms mapping any string x over the alphabet of L₁ to strings over the alphabet of L₂. Let gᵢ(x) be the result of applying the ith algorithm to x.
2) There is a Boolean function f(y₁, ..., y_k) such that f(y₁, ..., y_k) is true if and only if x is in L₁, when yᵢ is true if gᵢ(x) is in L₂, and yᵢ is false otherwise.
For example, let L₁ be the set of strings with equal numbers of 0's and 1's, and let L₂ be the set of strings with no fewer 0's than 1's. Let g₁(x) = x, and let g₂(x) be formed from x by replacing 0's by 1's and vice versa. Let f(y₁, y₂) = y₁ ∧ y₂. Then f(y₁, y₂) is true if and only if g₁(x) and g₂(x) both have no fewer 0's than 1's; that is, x has an equal number of 0's and 1's. Thus L₁ truth-table reduces to L₂.
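The example can be executed directly; a small sketch (the function names are ours):

```python
# The truth-table reduction of the example, spelled out.
#   L1 = strings with equal numbers of 0's and 1's
#   L2 = strings with no fewer 0's than 1's
# g1 is the identity, g2 swaps 0's and 1's, and f is logical AND.

def in_L2(x: str) -> bool:
    return x.count("0") >= x.count("1")

def g1(x: str) -> str:
    return x

def g2(x: str) -> str:
    return x.translate(str.maketrans("01", "10"))

def f(y1: bool, y2: bool) -> bool:
    return y1 and y2

def in_L1_via_tt(x: str) -> bool:
    return f(in_L2(g1(x)), in_L2(g2(x)))

for x in ["", "01", "0011", "001", "10", "000", "0101"]:
    assert in_L1_via_tt(x) == (x.count("0") == x.count("1"))
```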
8.14
a) Show that L₁ truth-table reduces to L₂ if L₁ many-one reduces to L₂.
b) Show that if L₁ truth-table reduces to L₂, then L₁ Turing-reduces to L₂.
c) Show that the complement of L_u truth-table reduces to S₁.

8.15 Consider a multitape TM with oracle which, when it queries its oracle, refers to the entire contents of a designated tape, say the last. Show that this model is equivalent to the oracle TM as defined in Section 8.9.

8.16 Show that PCP is decidable for words over a one-symbol alphabet.

*8.17 Show that PCP is equivalent to S₁.

*8.18 Show that PCP is undecidable if strings are restricted to have length one or two. What if strings are restricted to have length exactly two?

*8.19 Let σ be a total recursive function mapping indices of partial recursive functions to indices of partial recursive functions. Give an algorithm to enumerate an infinite set of fixed points of σ; that is, infinitely many i's such that fᵢ(y) = f_{σ(i)}(y) for all y.

8.20 Does there exist an effective enumeration of Turing machines M₁, M₂, ... such that no three consecutive TM's compute the same function?
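For Exercise 8.16, a unary instance with pairs (a^pᵢ, a^qᵢ) has a solution exactly when some pᵢ = qᵢ, or when one pair runs ahead (pᵢ > qᵢ) while another runs behind (pⱼ < qⱼ): repeating the ahead pair qⱼ − pⱼ times and the behind pair pᵢ − qᵢ times equalizes the two total lengths. A sketch of the resulting decision procedure (a solution to the exercise in code form, under this criterion):

```python
# Decision procedure for PCP over a one-symbol alphabet.  Each pair of
# strings (a^p, a^q) is represented just by the length pair (p, q).

def unary_pcp(pairs):
    """pairs: list of (p, q); True iff the PCP instance has a solution."""
    if any(p == q for p, q in pairs):
        return True                      # one index already matches
    # otherwise we need both a surplus pair and a deficit pair to cancel
    return any(p > q for p, q in pairs) and any(p < q for p, q in pairs)

assert unary_pcp([(2, 2)])               # single matching pair
assert unary_pcp([(3, 1), (1, 2)])       # 1 copy of (3,1) + 2 of (1,2): 5 = 5
assert not unary_pcp([(2, 1), (3, 1)])   # every pair runs ahead
```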
Solutions to Selected Exercises
8.3 Let M = (Q, Σ, Γ, δ, q₀, B, F) be a TM. We construct another TM M', such that M' halts on x if and only if M accepts x. We shall thus have shown that the question whether a TM halts on all inputs reduces to the question whether a TM accepts all inputs, which we know is undecidable. Incidentally, we shall also show by this construction that a question such as "Does a TM halt on a given input?" or "Does a TM halt on some input?" is also undecidable.

M' is designed to behave as follows. First, it shifts its input one position right, placing an endmarker $ on the leftmost cell. M' then simulates M. If δ(q, X) is undefined, and either (i) q is nonaccepting and X is any symbol in Γ ∪ {$} [note that δ(q, $) is surely undefined], or (ii) q is accepting and X is $, then M', scanning X in state q, moves right and enters state p₁. In state p₁, scanning any symbol, M' moves left and enters state p₂; in state p₂, scanning any symbol, M' moves right and enters p₁ again. Thus M' loops forever if M either halts in a nonaccepting state or falls off the left end of the tape in any state. If M enters an accepting state, not scanning $, then M' halts. Thus M' halts if and only if M accepts its input, as desired.
8.9 We must first show that the linear languages are closed under union and concatenation with regular sets. We look ahead to Theorem 9.2 for a proof that every regular set is generated by a CFG all of whose productions are of the forms A → Bw and A → w, for nonterminals A and B and string of terminals w. Any such grammar is surely linear. The proof that linear languages are closed under union is just like Theorem 6.1. For concatenation with a regular set, let G₁ = (V₁, T₁, P₁, S₁) be a linear grammar and G₂ = (V₂, T₂, P₂, S₂) be a grammar with all productions of the forms A → Bw and A → w. Assume V₁ and V₂ are disjoint. Let G = (V₁ ∪ V₂, T₁ ∪ T₂, P, S₂), where P consists of

i) all productions A → Bw of P₂,
ii) production A → S₁w whenever A → w is a production of P₂, and
iii) all productions of P₁.

Then L(G) is easily seen to be L(G₁)L(G₂), since all derivations in G are of the form S₂ ⇒* S₁x ⇒* yx, where S₂ ⇒* x and S₁ ⇒* y. Since regular sets and linear languages are closed under reversal, concatenation on the left by a regular set follows similarly.

Now we must show that "L = Σ*" is undecidable for linear languages. The proof closely parallels Lemma 8.7 and Theorem 8.11, the analogous results for general CFG's. The important difference is that we must redefine the form of valid computations so that the set of invalid computations is a linear CFL. Let us define a valid computation of TM M to be a string

w₁#w₂# ⋯ #w_{n−1}#wₙ##wₙᴿ#w_{n−1}ᴿ# ⋯ #w₂ᴿ#w₁ᴿ,     (8.4)

where each wᵢ is an ID, wᵢ ⊢ w_{i+1} for 1 ≤ i < n, w₁ is an initial ID, and wₙ is a final ID. Then it is not hard to construct a linear grammar for strings not of the form (8.4), paralleling the ideas of Lemma 8.7. Then the analog of Theorem 8.11 shows that "L = Σ*" is undecidable for linear grammars.
BIBLIOGRAPHIC NOTES

The undecidability of L_u is the basic result of Turing [1936]. Theorems 8.6 and 8.7, characterizing recursive and r.e. index sets, are from Rice [1953, 1956]. Post's correspondence problem was shown undecidable in Post [1946], and the proof of undecidability used here is patterned after Floyd [1964]. Lemmas 8.6 and 8.7, relating Turing machine computations to CFG's, are from Hartmanis [1967]. The fundamental papers on undecidable properties of CFL's are Bar-Hillel, Perles, and Shamir [1961] and Ginsburg and Rose [1963a]. However, Theorem 8.9, on ambiguity, was proved independently by Cantor [1962], Floyd [1962a], and Chomsky and Schutzenberger [1963]. Theorem 8.16, undecidability of inherent ambiguity, is taken from Ginsburg and Ullian [1966]. Linear grammars and their decision properties have been studied by Greibach [1963, 1966] and Gross [1964]. The approach used in the solution to Exercise 8.9 is from Baker and Book [1974]. Greibach's theorem is from Greibach [1968]. A generalization appears in Hunt and Rosenkrantz [1974], which includes a solution to Exercise 8.11. Hopcroft and Ullman [1968a] shows that for certain classes of languages defined by automata, the decidability of membership and emptiness are related. The S_mn- and recursion theorems are from Kleene [1952]. Example 8.10, on the nonexistence of proofs of halting or nonhalting for all TM's, is from Hartmanis and Hopcroft [1976]. Hartmanis and Hopcroft [1968] are the authors of the basic paper relating problems about CFL's to the hierarchy of undecidable problems. Theorems 8.19 and 8.20, as well as Exercise 8.6, are from there. Additional results of this nature have been obtained by Cudia and Singletary [1968], Cudia [1970], Hartmanis [1969], and Reedy and Savitch [1975]. Exercise 8.4, on tag systems, is from Minsky [1961].
CHAPTER 9

THE CHOMSKY HIERARCHY

Of the three major classes of languages we have studied (the regular sets, the context-free languages, and the recursively enumerable languages) we have grammatically characterized only the CFL's. In this chapter we shall give grammatical definitions of the regular sets and the r.e. languages. We shall also introduce a new class of languages, lying between the CFL's and the r.e. languages, giving both machine and grammatical characterizations for this new class. The four classes of languages are often called the Chomsky hierarchy, after Noam Chomsky, who defined these classes as potential models of natural languages.
9.1 REGULAR GRAMMARS

If all productions of a CFG are of the form A → wB or A → w, where A and B are variables and w is a (possibly empty) string of terminals, then we say the grammar is right-linear. If all productions are of the form A → Bw or A → w, we call it left-linear. A right- or left-linear grammar is called a regular grammar.

Example 9.1 The language 0(10)* is generated by the right-linear grammar

S → 0A
A → 10A | ε     (9.1)

and by the left-linear grammar

S → S10 | 0     (9.2)
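Grammar (9.1) can be checked mechanically against the regular expression: the sketch below (helper names are ours) enumerates the terminal strings derivable from S up to a length bound, by repeatedly rewriting the single variable in each sentential form, and compares the result with 0(10)*.

```python
import re

# Brute-force check that right-linear grammar (9.1) generates 0(10)*.
productions = {"S": ["0A"], "A": ["10A", ""]}     # grammar (9.1); "" is epsilon

def derive(max_len):
    derived, frontier = set(), {"S"}
    while frontier:
        nxt = set()
        for form in frontier:
            i = next((k for k, c in enumerate(form) if c.isupper()), None)
            if i is None:                          # no variable: terminal string
                if len(form) <= max_len:
                    derived.add(form)
                continue
            for rhs in productions[form[i]]:
                new = form[:i] + rhs + form[i + 1:]
                if len(new.rstrip("SA")) <= max_len:   # prune long terminal prefixes
                    nxt.add(new)
        frontier = nxt
    return derived

assert derive(7) == {"0" + "10" * k for k in range(4)}
assert all(re.fullmatch("0(10)*", w) for w in derive(7))
```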
Equivalence of regular grammars and finite automata

The regular grammars characterize the regular sets, in the sense that a language is regular if and only if it has a left-linear grammar, and if and only if it has a right-linear grammar. These results are proved in the next two theorems.
Theorem 9.1 If L has a regular grammar, then L is a regular set.

Proof First, suppose L = L(G) for some right-linear grammar G = (V, T, P, S). We construct an NFA with ε-moves, M = (Q, T, δ, [S], {[ε]}), that simulates derivations in G. Q consists of the symbols [α] such that α is S or a (not necessarily proper) suffix of some right-hand side of a production in P.

We define δ by:

1) If A is a variable, then δ([A], ε) = {[α] | A → α is a production}.
2) If a is in T and α is in T*V ∪ T*, then δ([aα], a) = {[α]}.

Then an easy induction on the length of a derivation or move sequence shows that δ([S], w) contains [α] if and only if S ⇒* xA ⇒ xyα, where A → yα is a production and xy = w, or if α = S and w = ε. As [ε] is the unique final state, M accepts w if and only if S ⇒* xA ⇒ w. But since every derivation of a terminal string has at least one step, we see that M accepts w if and only if G generates w. Hence every right-linear grammar generates a regular set.

Now let G = (V, T, P, S) be a left-linear grammar. Let G' = (V, T, P', S), where P' consists of the productions of G with right sides reversed, that is, P' = {A → αᴿ | A → α is in P}. If we reverse the productions of a left-linear grammar we get a right-linear grammar, and vice versa. Thus G' is a right-linear grammar, and it is easy to show that L(G') = L(G)ᴿ. By the preceding paragraph, L(G') is a regular set. But the regular sets are closed under reversal (Exercise 3.4g), so L(G')ᴿ = L(G) is also a regular set. Thus every right- or left-linear grammar defines a regular set.
Example 9.2 The NFA constructed by Theorem 9.1 from grammar (9.1) is shown in Fig. 9.1.

Fig. 9.1 NFA accepting 0(10)*.

Now consider grammar (9.2). If we reverse its
productions we get

S → 01S | 0

The construction of Theorem 9.1 for this grammar yields the NFA of Fig. 9.2(a). If we reverse the edges of that NFA and exchange the initial and final states, we get another NFA for 0(10)*.

Fig. 9.2 Construction of an NFA for 0(10)* from a left-linear grammar.

Theorem 9.2 If L is a regular set, then L is generated by some left-linear grammar and by some right-linear grammar.
Proof Let L = L(M) for DFA M = (Q, Σ, δ, q₀, F). First, suppose that q₀ is not a final state. Then L = L(G) for right-linear grammar G = (Q, Σ, P, q₀), where P consists of production p → aq whenever δ(p, a) = q, and also p → a whenever δ(p, a) is a final state. Then clearly δ(p, w) = q if and only if p ⇒* wq. If wa is accepted by M, let δ(q₀, w) = p, implying q₀ ⇒* wp. Also, δ(p, a) is final, so p → a is a production. Thus q₀ ⇒* wa. Conversely, let q₀ ⇒* x. Then x = wa, and q₀ ⇒* wp ⇒ wa for some state (variable) p. Then δ(q₀, w) = p, and δ(p, a) is final. Thus x is in L(M). Hence L(M) = L(G) = L.

Now let q₀ be in F, so ε is in L. We note that the grammar G defined above generates L − {ε}. We may modify G by adding a new start symbol S with productions S → q₀ | ε. The resulting grammar is still right-linear and generates L. To produce a left-linear grammar for L, start with an NFA for Lᴿ and then reverse the right sides of all productions of the resulting right-linear grammar.
Example 9.3 In Fig. 9.3 we see a DFA for 0(10)*. The right-linear grammar from this DFA is

A → 0B | 1D | 0
B → 0D | 1C
C → 0B | 1D | 0
D → 0D | 1D

As D is useless we may eliminate it, obtaining the grammar

A → 0B | 0
B → 1C
C → 0B | 0

Fig. 9.3 DFA for 0(10)*.
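The construction of Theorem 9.2 can be carried out mechanically on this DFA. The sketch below (helper names are ours) builds the productions p → aq and p → a from δ, then checks that the grammar derives exactly the strings the DFA accepts.

```python
import re
from itertools import product

# Theorem 9.2 applied to the DFA of Fig. 9.3 for 0(10)*
# (start state A, final state B).
delta = {("A", "0"): "B", ("A", "1"): "D",
         ("B", "0"): "D", ("B", "1"): "C",
         ("C", "0"): "B", ("C", "1"): "D",
         ("D", "0"): "D", ("D", "1"): "D"}
final = {"B"}

grammar = {}
for (p, a), q in sorted(delta.items()):
    grammar.setdefault(p, []).append(a + q)        # p -> aq
    if q in final:
        grammar.setdefault(p, []).append(a)        # p -> a when delta(p,a) is final

def dfa_accepts(w):
    state = "A"
    for a in w:
        state = delta[(state, a)]
    return state in final

def derives(w, var="A"):                           # does var =>* w in the grammar?
    if w == "":
        return False
    for rhs in grammar.get(var, []):
        if len(rhs) == 2 and rhs[0] == w[0] and derives(w[1:], rhs[1]):
            return True
        if len(rhs) == 1 and w == rhs:
            return True
    return False

for n in range(6):
    for tup in product("01", repeat=n):
        w = "".join(tup)
        assert derives(w) == dfa_accepts(w) == bool(re.fullmatch("0(10)*", w))
```

Note that since q₀ = A is not final, the grammar generates L directly, matching the first case of the proof.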
9.2 UNRESTRICTED GRAMMARS

The largest family of grammars in the Chomsky hierarchy permits productions of the form α → β, where α and β are arbitrary strings of grammar symbols, with α ≠ ε. These grammars are known as semi-Thue, type 0, phrase structure, or unrestricted grammars. We shall continue to use the 4-tuple notation G = (V, T, P, S) for unrestricted grammars. We say γαδ ⇒ γβδ whenever α → β is a production. As before, ⇒* stands for the reflexive and transitive closure of the relation ⇒:

L(G) = {w | w is in T* and S ⇒* w},

exactly as for context-free grammars.
Example 9.4 A grammar generating {aⁱ | i is a positive power of 2} is given below.

1) S → ACaB        5) aD → Da
2) Ca → aaC        6) AD → AC
3) CB → DB         7) aE → Ea
4) CB → E          8) AE → ε
A and B serve as left and right endmarkers for sentential forms; C is a marker that moves through the string of a's between A and B, doubling their number by production (2). When C hits the right endmarker B, it becomes a D or E by production (3) or (4). If a D is chosen, that D migrates left by production (5) until the left endmarker A is reached. At that point the D becomes a C again by production (6), and the process starts over. If an E is chosen, the right endmarker is consumed. The E migrates left by production (7) and consumes the left endmarker by production (8), leaving a string aⁱ for some i > 0. We can prove by induction on the number of steps in the derivation that if production (4) is never used, then any sentential form is either

i) S,
ii) of the form AaⁱCaʲB, where i + 2j is a positive power of 2, or
iii) of the form AaⁱDaʲB, where i + j is a positive power of 2.

When we use production (4) we are left with a sentential form AaⁱE, where i is a positive power of 2. Then the only possible steps in a derivation are i applications of (7) to yield AEaⁱ, followed by one application of (8), producing sentence aⁱ, where i is a positive power of 2.
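The grammar can also be explored mechanically. The sketch below does a breadth-first search over sentential forms; the length bound n + 3 (the a's plus A, B, and one of C, D, E) is our own observation, used only to cut off the search, and the helper names are ours.

```python
from collections import deque

# Brute-force derivation search for the grammar of Example 9.4,
# confirming that its terminal strings of length at most 8 are exactly
# a^2, a^4, and a^8.
rules = [("S", "ACaB"), ("Ca", "aaC"), ("CB", "DB"), ("CB", "E"),
         ("aD", "Da"), ("AD", "AC"), ("aE", "Ea"), ("AE", "")]

def terminal_strings(max_len):
    bound = max_len + 3                      # assumed bound on sentential forms
    seen, queue, results = {"S"}, deque(["S"]), set()
    while queue:
        form = queue.popleft()
        for lhs, rhs in rules:
            start = form.find(lhs)
            while start != -1:               # try every occurrence of lhs
                new = form[:start] + rhs + form[start + len(lhs):]
                if len(new) <= bound and new not in seen:
                    seen.add(new)
                    queue.append(new)
                    if new and set(new) == {"a"} and len(new) <= max_len:
                        results.add(new)
                start = form.find(lhs, start + 1)
    return results

assert terminal_strings(8) == {"aa", "aaaa", "aaaaaaaa"}
```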
Equivalence of type 0 grammars and Turing machines

We shall prove in the next two theorems that unrestricted grammars characterize the r.e. languages. The first theorem states that every type-0 grammar generates an r.e. set. An easy proof would be to give an algorithm for enumerating all strings generated by a type-0 grammar. Instead we construct a Turing machine recognizer for sentences generated by a type-0 grammar, since this construction will be useful later for a similar proof about context-sensitive grammars (the remaining class in the Chomsky hierarchy).

Theorem 9.3 If L is L(G) for unrestricted grammar G = (V, T, P, S), then L is an r.e. language.
Proof Let us construct a nondeterministic two-tape Turing machine M to recognize L. M's first tape is the input, on which a string w will be placed. The second tape is used to hold a sentential form α of G. M initializes α to S. Then M repeatedly does the following:

1) Nondeterministically select a position i in α, so that any i between 1 and |α| can be chosen. That is, start at the left, and repeatedly choose to move right or select the present position.
2) Nondeterministically select a production β → γ of G.
3) If β appears beginning in position i of α, replace β by γ there, using the "shifting-over" technique of Section 7.4, perhaps shifting left if |γ| < |β|.
4) Compare the resulting sentential form with w on tape 1. If they match, accept; w is a sentence of G. If not, go back to step (1).

It is easy to show that all and only the sentential forms of G appear on tape 2 when step (4) is executed, after some sequence of choices. Thus L(M) = L(G) = L, so L is r.e.

Theorem 9.4 If L is an r.e. language, then L = L(G) for some unrestricted grammar G.
Proof Let L be accepted by Turing machine M = (Q, Σ, Γ, δ, q₀, B, F). Construct a grammar G that "nondeterministically" generates two copies of a representation of some word in Σ* and then simulates the action of M on one copy. If M accepts the word, then G converts the second copy to a terminal string. If M does not accept, the derivation never results in a terminal string.
Formally, let G = (V, Σ, P, A₁), where V = ((Σ ∪ {ε}) × Γ) ∪ Q ∪ {A₁, A₂, A₃}, and the productions in P are:

1) A₁ → q₀A₂
2) A₂ → [a, a]A₂ for each a in Σ
3) A₂ → A₃
4) A₃ → [ε, B]A₃
5) A₃ → ε
6) q[a, X] → [a, Y]p, for each a in Σ ∪ {ε}, each q in Q, and X and Y in Γ, such that δ(q, X) = (p, Y, R)
7) [b, Z]q[a, X] → p[b, Z][a, Y], for each X, Y, and Z in Γ, each a and b in Σ ∪ {ε}, and each q in Q, such that δ(q, X) = (p, Y, L)
8) [a, X]q → qaq, q[a, X] → qaq, and q → ε, for each a in Σ ∪ {ε}, X in Γ, and q in F.
Using rules 1 and 2, we have

A₁ ⇒* q₀[a₁, a₁][a₂, a₂] ⋯ [aₙ, aₙ]A₂,

where aᵢ is in Σ for each i. Suppose that M accepts the string a₁a₂ ⋯ aₙ. Then for some m, M uses no more than m cells to the right of its input. Using rule 3, then rule 4 m times, and finally rule 5, we have

A₁ ⇒* q₀[a₁, a₁][a₂, a₂] ⋯ [aₙ, aₙ][ε, B]ᵐ.

From this point on, only rules 6 and 7 can be used until an accepting state is generated. Note that the first components of variables in (Σ ∪ {ε}) × Γ are never changed.

We can show by induction on the number of moves made by M that if

q₀a₁a₂ ⋯ aₙ ⊢* X₁X₂ ⋯ X_{r−1}qX_r ⋯ X_s,     (9.3)
then

q₀[a₁, a₁][a₂, a₂] ⋯ [aₙ, aₙ][ε, B]ᵐ ⇒* [a₁, X₁][a₂, X₂] ⋯ [a_{r−1}, X_{r−1}]q[a_r, X_r] ⋯ [a_{n+m}, X_{n+m}],     (9.4)

where a₁, a₂, ..., aₙ are in Σ, a_{n+1} = a_{n+2} = ⋯ = a_{n+m} = ε, X₁, X₂, ..., X_s are in Γ, and X_{s+1} = X_{s+2} = ⋯ = X_{n+m} = B.

The inductive hypothesis is trivially true for zero moves, since r = 1 and s = n. Suppose it is true for k − 1 moves, and let
q₀a₁a₂ ⋯ aₙ ⊢^{k−1} X₁X₂ ⋯ X_{r−1}qX_r ⋯ X_s ⊢ Y₁Y₂ ⋯ Y_{t−1}pY_t ⋯ Y_u,

where the a's and X's satisfy the conditions of (9.4).

If t = r + 1, then the kth move of M is to the right, so δ(q, X_r) = (p, Y_r, R). By rule (6), q[a_r, X_r] → [a_r, Y_r]p is a production of G. Thus, by the inductive hypothesis,

q₀[a₁, a₁] ⋯ [aₙ, aₙ][ε, B]ᵐ ⇒* [a₁, Y₁][a₂, Y₂] ⋯ [a_{t−1}, Y_{t−1}]p[a_t, Y_t] ⋯ [a_{n+m}, Y_{n+m}],     (9.5)

where Yᵢ = B for i > u. If t = r − 1, then the kth move of M is to the left, and we prove (9.5) using rule (7) and the observations that r > 1 and δ(q, X_r) = (p, Y_r, L). By rule (8), if p is in F, then

[a₁, Y₁] ⋯ [a_{t−1}, Y_{t−1}]p[a_t, Y_t] ⋯ [a_{n+m}, Y_{n+m}] ⇒* a₁a₂ ⋯ aₙ.

We have thus shown that if w is in L(M), then A₁ ⇒* w, so w is in L(G).

For the converse, that w in L(G) implies w in L(M), an induction similar to the above shows that (9.4) implies (9.3). We leave this part as an exercise. Then we note that there is no way to remove the state of M from sentential forms of G without using rule (8). Thus G cannot derive a terminal string without simulating an accepting computation of M. By rule (8), the string derived must be the first components of the variables in (Σ ∪ {ε}) × Γ, which are never changed as moves of M are simulated.

9.3 CONTEXT-SENSITIVE LANGUAGES

Suppose we place the restriction on productions α → β of a phrase structure grammar that β be at least as long as α. Then we call the resulting grammar context-sensitive, and its language a context-sensitive language (CSG and CSL, respectively). The term "context-sensitive" comes from a normal form for these grammars, where each production is of the form α₁Aα₂ → α₁βα₂, with β ≠ ε. Productions of the latter form look almost like context-free productions, but they permit replacement of variable A by string β only in the "context" α₁ ⋯ α₂. We leave this normal form as an exercise.

Almost any language one can think of is context-sensitive; the only known proofs that certain languages are not CSL's are ultimately based on diagonalization. These include L_u of Chapter 8 and the languages to which we may reduce L_u, for example, the languages proved undecidable in Chapter 8. We shall prove in Section 9.4 that there are recursive languages that are non-CSL's, and in Chapter 12 we shall refine this statement somewhat. In both cases the proofs proceed by diagonalization.

Example 9.5 Consider again the grammar of Example 9.4. There are two productions that violate the definition of a context-sensitive grammar. These are CB → E and AE → ε. We can create a CSG for the language {aⁱ | i is a positive power of 2} by realizing that A, B, C, D, and E are nothing but markers, which eventually disappear. Instead of using separate symbols for the markers, we can incorporate these markers into the a's by creating "composite" variables like [CaB], which is a single symbol appearing in place of the string CaB. The complete set of composite symbols we need to mimic the grammar of Example 9.4 is [ACaB], [Aa], [ACa], [ADa], [AEa], [Ca], [Da], [Ea], [aCB], [CaB], [aDB], [aE], [DaB], and [aB]. The productions of our context-sensitive grammar, which we group according to the production from Example 9.4 that they mimic, are:

1) S → [ACaB]
2) [Ca]a → aa[Ca]
   [Ca][aB] → aa[CaB]
   [ACa]a → [Aa]a[Ca]
   [ACa][aB] → [Aa]a[CaB]
   [ACaB] → [Aa][aCB]
   [CaB] → a[aCB]
3) [aCB] → [aDB]
4) [aCB] → [aE]
5) a[Da] → [Da]a
   [aDB] → [DaB]
   a[DaB] → [Da][aB]
   [Aa][Da] → [ADa]a
   [Aa][DaB] → [ADa][aB]
6) [ADa] → [ACa]
7) a[Ea] → [Ea]a
   [aE] → [Ea]
   [Aa][Ea] → [AEa]a
8) [AEa] → a

It is straightforward to show that S ⇒* α in the grammar of Example 9.4 if and only if S ⇒* α' in the present CSG, where α' is formed from α by grouping with an a all markers (A through E) appearing between it and the a to its left, and also grouping with the first a any markers to its left and with the last a any markers to its right. For example, if α = AaaCaB, then α' is [Aa]a[CaB].

Linear bounded automata

Now we introduce a machine characterization of the CSL's. A linear bounded automaton (LBA) is a nondeterministic Turing machine satisfying the following two conditions.

1) Its input alphabet includes two special symbols ¢ and $, the left and right endmarkers, respectively.
2) The LBA has no moves left from ¢ or right from $, nor may it print another symbol over ¢ or $.

The linear bounded automaton is simply a Turing machine which, instead of having potentially infinite tape on which to compute, is restricted to the portion of the tape containing the input x plus the two tape squares holding the endmarkers. We shall see in Chapter 12 that restricting the Turing machine to an amount of tape that, on each input, is bounded by some linear function of the length of the input would result in the same computational ability as restricting the Turing machine to the portion of the tape containing the input, hence the name "linear bounded automaton."

An LBA will be denoted M = (Q, Σ, Γ, δ, q₀, ¢, $, F), where Q, Σ, Γ, δ, q₀, and F are as for a nondeterministic TM, and ¢ and $ are symbols in Σ, the left and right endmarkers. L(M), the language accepted by M, is

{w | w is in (Σ − {¢, $})* and q₀¢w$ ⊢* αqβ for some q in F}.

Note that the endmarkers are on the input tape initially but are not considered part of the word to be accepted or rejected. Since an LBA cannot move off the input, there is no need to suppose that there is blank tape to the right of the $.

Equivalence of LBA's and CSG's

We now show that, except for the fact that an LBA can accept ε while a CSG cannot generate ε, the LBA's accept exactly the CSL's.

Theorem 9.5 If L is a CSL, then L is accepted by some LBA.
The proof is almost the same as that for Theorem 9.3. The only difference TM of Theorem 9.3 generated sentential forms of an unrestricted grammar on a second tape, the LBA uses a second track of its input tape. Presented with §w$ on its tape, the LBA starts by writing the symbol S on a second Proof is
that while the
track below the leftmost accepting. Next the sentential
LBA
symbol of w.
If
w=
c,
the
LBA
instead halts without
repeatedly guesses a production and a position in the
form written on the second track.
It
applies the production, shifting the
portion of the sentential form to the right whenever the sentential form expands. If,
however, the new sentential form
acceptance. Thus the
LBA
will accept
is
w
longer than w, the if
there
is
LBA
a derivation S ^>
halts without
w such
that
no
THE CHOMSKY HIERARCHY
226
intermediate sentential form
production
in
a
CSG
is
is
derivation S ^> a ^> w, where a the words generated by the
Theorem CSL.
9.6
If
longer than w. But since the right side of any
as long or longer than the
L = L(M)
is
longer than w.
left side,
there could not be a
Thus the LB A accepts all and only
CSG. for
LBA
M = (Q, Z, T,
S,
q0
,
$, F),
then
L-
{e} is
a
an unrestricted grammar from a endmarkers on the LBA tape must be incorporated into adjacent tape symbols, and the state must likewise be incorporated into the symbol scanned by the tape head. The reason for this is that
Proof The proof
TM
Theorem
in
parallels the construction of
The
9.4.
differences are that the
CSG simulated the LBA using separate symbols for the endmarkers, or state,
if
the
it
could not erase these symbols afterward, since that would necessitate shortening
a sentential form, and the right side of every the
left
forms the is
CSG production is at least as long as
The generation of a sequence of pairs, the first component of which terminal string a a 2 ~ a n and the second of which forms the LBA tape
side.
m
{
accomplished by the productions
A2 for all a in
I-
-* [a,
as
an exercise.
If
q
is final,
-> [a, a$],
$}.
The LBA-simulating left
A2
a]A 2y
then
rules are similar to rules 6
and 7
in
Theorem
9.4
and are
we have production [a,
aqp] -» a
$, for all a in I $} and all possible a and P (that is, a and/or /? could include and one tape symbol). Note that the number of productions defined is finite. We also allow deletion of the second component of a variable if it is adjacent to a terminal, by
for
any a and b
in
Z—
{<(:,
$}
The productions shown
and
[a, oi\b
-> ab,
b[a, a]
- ba
all
possible
a's.
explicitly are clearly context-sensitive.
simulating productions can easily be
made
The LBA-
length preserving, so the resulting
M
if and only if it CSG. A proof that any word w but e is accepted by grammar parallels Theorem 9.4, and we omit it. Note that on that or simulate there is no way for the grammar to set up the LBA input input. Thus e is not generated by the grammar whether or not it is in L(M).
grammar is
is
a
generated by the
M
9.4
RELATIONS BETWEEN CLASSES OF LANGUAGES
|
9.4
RELATIONS BETWEEN CLASSES OF LANGUAGES
The
four classes of
languages— r.e.
sets,
referred to as languages of types 0,
1,
227
CSL's, CFL's, and regular sets— are often 2,
and
3,
respectively.
We
can show that
except for the matter of the empty string, the type-/ languages properly include the type-(i + 1) languages for i = 0, 1, and 2. We first need to show that every CSL is
and
recursive,
in fact, there are recursive
CSL's and recursive
Theorem
sets
Every
9.7
languages that are not CSL's.
CSL
is
recursive.
Proof Given a CSG G = (V, T, P, S) and a word w in Z* of length n, we can test whether w is in L(G) as follows. Construct a graph whose vertices are the strings in (V u T)* of length n or less. Put an arc from a to p if a => /?. Then paths in the graph correspond to derivations in G, and w is in L(G) if and only if there is a path from the vertex for 5 to the vertex for w. Use any of a number of path-finding algorithms (see Aho, Hopcroft, and Ullman [1974]) to decide whether such a path exists.
Consider the CSG of Example 9.5 and input w = aa. One way to graph is to start with string S, and at the ith step find the strings of length n or less having a path from S of length i or less. If we have the set for i — 1, say Sf, then the set for i is u {/? a => f$ for some a in 5? and \fj\
Example test for
9.6
paths
in the
^
=
|
0:
{S}
/=1:
{S,
[ACaB]}
i
= 2:
{S,
[ACaB], [Aa][aCB]}
i
=
3:
{S,
[ACaB], [Aa][aCB], [Aa][aDB], [Aa][aE]}
i
=
6:
{S,
[ACaB], [Aa][aCB], [Aa][aDB], [Aa][aE],
i
[Aa][DaB], [Aa][Ea], [ADa][aB\ [AEa]a,
[ACa][aB], aa}
=
we discover that aa is reachable from S we need go no further. number of sentential forms of length n or less is finite for any grammar and fixed n, we know we shall eventually come to a point where no
Since for
i
6
In general, since the fixed
new i
—
sentential forms are added. Since the set for 1,
never
we
shall never
will.
add any new
In that case
w
is
strings,
so
if
i
depends only on the set for yet produced w, we
we have not
not in the language.
THE CHOMSKY HIERARCHY
228
To prove that the CSL's are a proper subset of the recursive languages we prove something more general In particular, we show that any class of languages that can be effectively enumerated, by listing one or more Turing machines that on
halt
all inputs,
each
for
member
of the class,
is
a proper subclass of the
recursive languages.
Lemma any
M M
Let
9.1
that halt
on
l5
all inputs.
2 , ...
Then
be an enumeration of some
there
is
some
set of
Turing machines is not L(M,) for
recursive language that
/.
Proof Let L be the subset of (0 + 1)* such that w is in L if and only if Mᵢ does not accept w, where i is the integer whose binary representation is w. L is recursive, since given w we can generate Mᵢ and test whether or not w is in L(Mᵢ). But no TM on the list accepts L. Suppose L were L(Mⱼ), and let x be the binary representation of j. If x is in L, then x is not in L(Mⱼ), and if x is not in L, then x is in L(Mⱼ). Thus L ≠ L(Mⱼ) as supposed. Hence L is a recursive language that is not L(Mⱼ) for any j.
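The diagonal argument can be made concrete by letting total Python predicates stand in for the halting machines Mᵢ. The particular enumeration below is a hypothetical example, used only to illustrate the construction.

```python
# Illustration of the diagonalization in Lemma 9.1.  Total Python
# predicates stand in for the halting Turing machines M_1, M_2, ...;
# the machines in the dictionary below are hypothetical examples.

def diagonal_language(machine):
    """Given an enumeration i -> M_i, return a decider for a language L
    with: w in L iff M_i does not accept w, where i has binary rep w."""
    def L(w):
        i = int(w, 2)
        return not machine(i)(w)
    return L

machines = {
    1: lambda w: w.endswith("1"),     # M_1 accepts strings ending in 1
    2: lambda w: len(w) % 2 == 0,     # M_2 accepts even-length strings
    3: lambda w: False,               # M_3 accepts nothing
}
L = diagonal_language(lambda i: machines[i])
```

For x = "1", the binary representation of 1, M₁ accepts x, so x is not in L; by construction no Mᵢ can accept exactly L.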
Theorem 9.8 There is a recursive language that is not context-sensitive.
Proof By Lemma 9.1 we need only show that we can enumerate halting TM's for the CSL's over alphabet {0, 1}. Let the 4-tuple representation for CSG's with terminal alphabet {0, 1} be given some binary coding. For example, we could let 0, 1, comma, →, {, }, (, and ) be denoted by 10, 100, ..., 10⁸, respectively, and let the ith variable be denoted by 10⁸⁺ⁱ. Let Mⱼ be the Turing machine implementing the algorithm of Theorem 9.7 that recognizes the language of the CSG with binary code j. Clearly Mⱼ always halts, whether its input is accepted or not. The theorem then follows immediately from Lemma 9.1.
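The coding in the proof is easy to make concrete. A minimal sketch follows; the particular ordering chosen for the eight fixed symbols is an arbitrary assumption of the sketch.

```python
# Sketch of the binary coding of CSG's from the proof of Theorem 9.8:
# the eight fixed symbols get codes 10, 100, ..., 10^8 (a 1 followed by
# k zeros), and the ith variable gets a 1 followed by 8 + i zeros.
# The ordering of the fixed symbols below is one arbitrary choice.
FIXED = ["0", "1", ",", "->", "{", "}", "(", ")"]

def code(symbol=None, var_index=None):
    if var_index is not None:          # the ith variable
        return "1" + "0" * (8 + var_index)
    return "1" + "0" * (FIXED.index(symbol) + 1)
```

Since every code starts with 1 and codes differ in their run of 0's, a coded 4-tuple can be uniquely decoded back into the grammar it represents.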
The hierarchy theorem

Theorem 9.9 (a) The regular sets are properly contained in the context-free languages. (b) The CFL's not containing the empty string are properly contained in the context-sensitive languages. (c) The CSL's are properly contained in the r.e. sets.

Proof Part (a) follows from the fact that every regular grammar is a CFG, and the fact that {0ⁿ1ⁿ | n ≥ 1} is an example of a CFL that is not regular. Part (b) is proved by noting that every CFG in Chomsky normal form is a CSG, while {a^(2^i) | i ≥ 1} is a CSL easily shown not to be a CFL by the pumping lemma. For part (c), every CSG is surely an unrestricted grammar. Proper containment follows from Theorem 9.8.
EXERCISES

9.1 Construct left-linear and right-linear grammars for the languages
a) (0 + 1)*00(0 + 1)*
b) 0*(1(0 + 1))*
c) (((01 + 10)*11)*00)*

9.2 Show the following normal form for right-linear grammars: If L is a regular set, then L − {ε} is generated by a grammar in which all productions are of the form A → aB or A → a, for terminal a and variables A and B. Show the analogous result for left-linear grammars.

9.3 A context-free grammar is said to be simple if it is in Greibach normal form and, for every variable A and terminal a, there is at most one string α such that A → aα is a production. A language is simple if it has a simple grammar. For example, L = {0ⁿ1ⁿ | n ≥ 1} has the simple grammar

S → 0A
A → 0AB | 1
B → 1

Note that the more natural GNF grammar for L,

S → 0SB | 0B
B → 1

is not simple, because there are two S-productions whose right sides begin with 0. Prove that every regular set not containing ε is a simple language. [Hint: Use a DFA representation for the regular set.]
*9.4 A CFG G is said to be self-embedding if there is some useful variable A such that A ⇒* wAx, and neither w nor x is ε. Prove that a CFL is regular if and only if it has a CFG that is not self-embedding. [Hint: It is easy to show that no regular grammar is self-embedding. For the "if" portion, show that a non-self-embedding grammar may be put in Greibach normal form without making it self-embedding. Then show that for every non-self-embedding GNF grammar, there is a constant k such that no left-sentential form has more than k variables. Finally, show from the above that the non-self-embedding GNF grammar can be converted to a regular grammar.]
*9.5 Give unrestricted grammars for
a) {ww | w is in (0 + 1)*}
b) {0ⁱ1ʲ2ⁱʲ | i ≥ 1 and j ≥ 1}
c) {0ⁱ | i is not a prime}
d) {0ⁱ1ⁱ2ⁱ | i ≥ 1}

9.6 Give context-sensitive grammars for the languages of Exercise 9.5, excluding the ε in (a).
9.7 A CSL is said to be deterministic if it is accepted by some deterministic LBA. Show that the complement of a deterministic CSL is also a deterministic CSL. [Hint: Show that for every deterministic LBA there is an equivalent LBA that halts on every input.] It is, incidentally, open whether every CSL is a deterministic CSL, and whether the CSL's are closed under complementation. Obviously a positive answer to the former question would imply a positive answer to the latter.
*9.8
a) Show that every context-free language is accepted by a deterministic LBA.
b) Show that the Boolean closure of the CFL's is contained within the class of sets accepted by deterministic LBA's.
c) Show that the containment in (b) is proper. [Hint: Consider languages over a one-symbol alphabet.]

*9.9 Show that every CSL is generated by a grammar in which all productions are of the form αAβ → αγβ, where α, β, and γ are strings of grammar symbols, and A is a variable.

*S9.10 Show that the CSL's are closed under the following operations:
a) union
b) concatenation
c) intersection
d) substitution
e) inverse homomorphism
f) positive closure (recall L⁺ = L¹ ∪ L² ∪ ⋯)

*9.11 Show that the r.e. sets are closed under the following operations:
a) through e) same as Exercise 9.10.
f) Kleene closure.
9.12
a) Show that all the undecidable properties of CFL's mentioned in Sections 8.5, 8.6, and 8.7 are undecidable for CSL's, with the exception that "= Σ*" is trivially decidable because no CSL contains ε.
b) Show that "= Σ⁺" is undecidable for CSL's.

S9.13 Show that it is undecidable whether a given CSL is empty.

*S9.14 Show that every r.e. set is h(L), where h is a homomorphism and L is a CSL.
Solutions to Selected Exercises

9.10 The proofs are similar to the proofs of Theorems 6.1, 6.2, and 6.3 for CFL's. However, there is one problem with which we have to deal. Consider the concatenation construction. Suppose G₁ = (V₁, T₁, P₁, S₁) and G₂ = (V₂, T₂, P₂, S₂) are CSG's generating L₁ and L₂, respectively. In Theorem 6.1 for CFG's, we constructed the grammar

G₄ = (V₁ ∪ V₂ ∪ {S₄}, T₁ ∪ T₂, P₁ ∪ P₂ ∪ {S₄ → S₁S₂}, S₄)

to generate L₁L₂. This construction is correct for CFG's, provided V₁ and V₂ are disjoint. For CSG's, however, we could have a production α → β in P₁ or P₂ that was applicable in a sentential form of G₄, say γδ, where S₁ ⇒* γ and S₂ ⇒* δ, in such a position that α straddles the boundary between γ and δ. We might thus derive a string not in L₁L₂. Assuming V₁ ∩ V₂ = ∅ doesn't help, since α could consist of terminals only, and of course we cannot assume that T₁ ∩ T₂ = ∅. What we need is a normal form for CSG's that allows only variables on the left sides of productions. Such a lemma is easy to prove. Let G = (V, T, P, S) be a CSG. Construct G′ = (V′, T, P′, S), where V′ consists of V plus the variables A_a for each a in T. P′ consists of the productions A_a → a for each a in T, and a production α′ → β′ for each α → β in P, where α′ is α with each occurrence of a terminal a replaced by A_a, and β′ is similarly related to β.
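The normal-form transformation just described is entirely mechanical. The following sketch assumes productions represented as pairs of symbol tuples; the fresh variable names of the form "A_a" are an assumption of the sketch.

```python
# Sketch of the normal form used above: every terminal a occurring in a
# production is replaced by a fresh variable A_a, and productions
# A_a -> a are added.  Productions are (lhs, rhs) pairs of symbol tuples.

def only_variables_on_left(productions, terminals):
    var_of = {a: "A_" + a for a in terminals}     # fresh variable names
    relabel = lambda s: tuple(var_of.get(x, x) for x in s)
    new = [(relabel(lhs), relabel(rhs)) for lhs, rhs in productions]
    new += [((A,), (a,)) for a, A in sorted(var_of.items())]
    return new
```

After the transformation, every left side consists of variables only, so a left side derived from one grammar can never straddle a boundary into symbols derived from another.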
Now, if we assume that G₁ and G₂ have disjoint sets of variables and are in the above normal form, the constructions of Theorem 6.1 for union and concatenation carry over to CSL's. Positive closure presents another problem. If, in analogy with Theorem 6.1, we construct

G₅ = (V₁ ∪ {S₅}, T₁, P₁ ∪ {S₅ → S₁S₅ | S₁}, S₅),

we have not avoided the problem of the potential for applying a production α → β in such a way that it straddles the strings derived from two or more instances of S₁. What we can do is create grammar G′₁, which is G₁ with each variable A replaced by a new symbol A′. Then we construct the grammar G₅ = (V₅, T₁, P₅, S₅), where V₅ consists of the variables of G₁ and G′₁, plus the symbols S₅ and S′₅; P₅ consists of the productions of G₁ and G′₁, plus

S₅ → S₁S′₅ | S₁   and   S′₅ → S′₁S₅ | S′₁.

As no string derived from an instance of S₁ or S′₁ is ε, we can never have symbols derived from two instances of S₁, or from two instances of S′₁, adjacent, and we may be sure that each production of G₅ is applied within a string derived from a single instance of S₁ or S′₁.

Inverse homomorphism, intersection, and substitution are best handled by machine-based proofs. Let L be a CSL accepted by LBA M and h a homomorphism. Suppose that |h(a)| ≤ k for any a. Then we may construct LBA M′ for h⁻¹(L) as follows. M′ takes its input x and computes h(x), storing k symbols per cell. There is sufficient space, since |h(x)| ≤ k|x|. Then M′ simulates M on h(x), accepting if M accepts.

For intersection, let L₁ and L₂ be CSL's accepted by LBA's M₁ and M₂. Construct M₃ that treats its input as if it were written on two tracks. That is, we identify input symbol a with [a, a]. On the first track M₃ simulates M₁. If some sequence of choices of moves by M₁ causes it to accept, M₃ begins to simulate M₂ on the second track, accepting if M₂ accepts. Thus M₃ accepts L₁ ∩ L₂.

For substitution into CSL L ⊆ Σ⁺ of CSL's L_a for symbols a in Σ, construct an LBA that works as follows. Given input a₁a₂⋯aₙ, nondeterministically guess which positions end strings in some L_a and mark them. If we guess that aᵢaᵢ₊₁⋯aⱼ is in some particular L_a, simulate the LBA for L_a on that substring. If aᵢ⋯aⱼ is in L_a, replace it by a. If all our guesses are correct, take the resulting string in Σ* and simulate an LBA for L on it, accepting a₁a₂⋯aₙ if that LBA accepts.
9.13 It is easy to design an LBA to accept the valid computations of a given Turing machine. Thus the emptiness problem for Turing machines is reducible to the question whether a given CSL is empty.

9.14 Let L₁ be an r.e. set and c a symbol not in the alphabet of L₁. Let M₁ be a TM accepting L₁, and define

L₂ = {wcⁱ | M₁ accepts w by a sequence of moves in which the head never moves more than i positions to the right of w}.

Then L₂ is accepted by an LBA that simulates M₁, treating c as the blank and halting if it ever goes beyond the sequence of c's on its input. We have only to show that L₁ = h(L₂) for some homomorphism h. Let h(a) = a for all symbols in the alphabet of L₁, and h(c) = ε.

Combining Exercise 9.14 with Theorem 9.9, we observe that the CSL's are not closed under homomorphism. This may seem paradoxical, since Exercise 9.10 claimed the CSL's were closed under substitution. However, homomorphism is not a special case of substitution by a CSL, as a CSL may not contain ε; in particular, for the h defined above, {h(c)} = {ε} is not a CSL. The CSL's are, however, closed under homomorphisms that do not map any symbol to ε.
BIBLIOGRAPHIC NOTES

The Chomsky hierarchy was defined in Chomsky [1956, 1959]. Chomsky and Miller [1958] showed the equivalence of regular grammars and regular sets. Kuroda [1964] showed the equivalence of LBA's and CSG's. Previously, Myhill [1960] had defined deterministic LBA's, and Landweber [1963] showed that the deterministic LBA languages are contained in the CSL's. Chomsky [1959] showed that the r.e. sets are equivalent to the languages generated by type-0 grammars. Fischer [1969] gives some interesting characterizations of the CSL's. Hibbard [1974] discusses a restriction on CSG's that yields the context-free languages. Additional closure properties of CSL's are studied in Ginsburg and Greibach [1966b] and Wegbreit [1969]. Basic decision properties of CSL's are given in Landweber [1964].
CHAPTER 10

DETERMINISTIC CONTEXT-FREE LANGUAGES

We now have machine models that define each class of languages in the Chomsky hierarchy. At the extreme ends of the hierarchy, the machines (finite automata and Turing machines) exhibit no difference in accepting ability between their deterministic and nondeterministic models. For the linear bounded automaton, it is unknown whether the deterministic and nondeterministic varieties accept the same class of languages. However, for pushdown automata, we do know that the deterministic PDA's accept a family of languages, the deterministic context-free languages (DCFL's), lying properly between the regular sets and the context-free languages.

It turns out that the syntax of many programming languages can be described by means of DCFL's. Moreover, modern compiler writing systems usually require that the syntax of the language for which they are to produce a compiler be described by a context-free grammar of restricted form. These restricted forms almost invariably generate only DCFL's. We shall meet what is probably the most important of these restricted forms: the LR-grammars. The LR-grammars have the property that they generate exactly the DCFL's.

If a compiler writing system is to be used, it is generally necessary that the language designer choose a syntax for his language that makes it a DCFL. Thus it is useful to be able to determine whether a proposed language is in fact a DCFL. If it is, one can often prove it so by producing a DPDA or LR-grammar defining the language. But if the language L is not a DCFL, how are we to prove it? If L is not a CFL at all, we could use the pumping lemma, perhaps. However, L will often be a CFL but not a DCFL. There is no known pumping lemma specifically for DCFL's, so we must fall back on closure properties. Fortunately, the DCFL's are closed under a number of operations, such as complementation, that do not preserve CFL's in general. Thus, if L is a CFL but its complement is not, then L is not a DCFL.

Sections 10.1 through 10.4 develop various closure properties of DCFL's. Section 10.5 covers decision properties. Sections 10.6 and 10.7 treat the LR-grammars briefly.
10.1 NORMAL FORMS FOR DPDA's

Recall that PDA M = (Q, Σ, Γ, δ, q₀, Z₀, F) is deterministic if:

1) whenever δ(q, a, X) is nonempty for some a in Σ, then δ(q, ε, X) is empty, and
2) for each q in Q, a in Σ ∪ {ε}, and X in Γ, δ(q, a, X) contains at most one element.

Rule (1) prevents a choice between using the next input symbol and making an ε-move. Rule (2) prevents a choice on the same input. For deterministic PDA's we shall hereafter write δ(q, a, X) = (p, γ) rather than δ(q, a, X) = {(p, γ)}.
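Conditions (1) and (2) are directly checkable from a finite transition table. The following sketch assumes a table mapping (state, input symbol or None, stack symbol) to a set of moves, with None standing for an ε-input; this representation is an assumption of the sketch, not the book's notation.

```python
# Sketch of the two determinism conditions for a PDA transition table.
# delta maps (state, input-symbol-or-None, stack-symbol) to a set of
# moves, with None standing for an e-input.

def is_deterministic(delta, states, alphabet, stack_symbols):
    for q in states:
        for X in stack_symbols:
            eps_moves = delta.get((q, None, X), set())
            if len(eps_moves) > 1:            # rule (2) for e-moves
                return False
            for a in alphabet:
                moves = delta.get((q, a, X), set())
                if moves and eps_moves:       # rule (1): both kinds defined
                    return False
                if len(moves) > 1:            # rule (2): choice on one input
                    return False
    return True
```

Adding an ε-move for a (state, stack symbol) pair that already has an input move is exactly what rule (1) forbids, and the check reports it.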
Like PDA's in general, we can put DPDA's in a normal form where the only stack operations are to erase the top symbol or to push one symbol. This form will be proved in the next two lemmas. The first lemma shows that the DPDA need never push more than one symbol per move, since it can push a string of symbols one at a time, using ε-moves. The second lemma shows that DPDA's need never change the top stack symbol. Changes are avoided by storing the top stack symbol in the finite control and recording changes to it there. The reader who grasps these ideas should skip to the start of the next section.
Lemma 10.1 Every DCFL is L(M) for a DPDA M = (Q, Σ, Γ, δ, q₀, Z₀, F) such that if δ(q, a, X) = (p, γ), then |γ| ≤ 2.

Proof If δ(q, a, X) = (r, γ) and |γ| = n > 2, let γ = Y₁Y₂⋯Yₙ. Create new nonaccepting states p₁, p₂, ..., pₙ₋₂, and redefine

δ(q, a, X) = (p₁, Yₙ₋₁Yₙ);

then define

δ(pᵢ, ε, Yₙ₋ᵢ) = (pᵢ₊₁, Yₙ₋ᵢ₋₁Yₙ₋ᵢ)   for 1 ≤ i ≤ n − 3,

and

δ(pₙ₋₂, ε, Y₂) = (r, Y₁Y₂).

The revised DPDA takes n − 1 moves to do what M does in one, but it still replaces X by Y₁Y₂⋯Yₙ on top of the stack and enters state r, as M did.
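The construction of Lemma 10.1 can be written out mechanically. In the sketch below, the dictionary representation and the fresh-state naming scheme are assumptions; γ is a tuple of stack symbols with the new top of stack first, as in the text.

```python
# Sketch of the Lemma 10.1 construction: a move pushing gamma = Y1...Yn
# (n > 2, Y1 the new top) becomes n - 1 moves through fresh nonaccepting
# states, each replacing the top symbol by at most two symbols.
def limit_pushes(delta):
    new_delta, fresh = {}, 0
    for (q, a, X), (r, gamma) in delta.items():
        n = len(gamma)
        if n <= 2:
            new_delta[(q, a, X)] = (r, gamma)
            continue
        ps = ["p%d_%d" % (fresh, i) for i in range(1, n - 1)]
        fresh += 1
        # First, replace X by the bottom two symbols Y_{n-1} Y_n.
        new_delta[(q, a, X)] = (ps[0], gamma[n - 2:])
        # Then push Y_{n-2}, ..., Y_2 one at a time on e-input (None).
        for i in range(1, n - 2):
            new_delta[(ps[i - 1], None, gamma[n - 1 - i])] = \
                (ps[i], gamma[n - 2 - i:n - i])
        # Finally push Y_1 and enter the original target state r.
        new_delta[(ps[-1], None, gamma[1])] = (r, gamma[0:2])
    return new_delta
```

Tracing a push of (Y1, Y2, Y3, Y4) shows the stack growing from X to Y3 Y4, then Y2 Y3 Y4, then Y1 Y2 Y3 Y4, ending in state r as required.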
Lemma 10.2 Every DCFL is L(M) for a DPDA M = (Q, Σ, Γ, δ, q₀, Z₀, F) such that if δ(q, a, X) = (p, γ), then γ is either ε (a pop), X (no stack move), or YX (a push) for some stack symbol Y.

Proof Assume L = L(M′), where M′ = (Q′, Σ, Γ′, δ′, q′₀, Z′₀, F′) satisfies Lemma 10.1. We construct M to simulate M′ while keeping the top stack symbol of M′ in M's control. Formally, let

Q = Q′ × Γ′,   q₀ = [q′₀, Z′₀],   F = F′ × Γ′,   and   Γ = Γ′ ∪ {X₀},

where X₀ is a new symbol not in Γ′. Define δ by:

i) If δ′(q, a, X) = (p, ε), then for all Y, δ([q, X], a, Y) = ([p, Y], ε). M pops its stack, picking up the symbol popped in its control.

ii) If δ′(q, a, X) = (p, Y), then for all Z, δ([q, X], a, Z) = ([p, Y], Z). If M′ changes its top stack symbol, M records the change in its own control but does not alter its stack.

iii) If δ′(q, a, X) = (p, YX), then for all W, δ([q, X], a, W) = ([p, Y], XW). If M′ grows its stack, M pushes a symbol onto its stack.

It is easy to show by induction on the number of moves made that

(q′₀, w, Z′₀) ⊢* (q, ε, X₁X₂⋯Xₙ) in M′   if and only if   ([q′₀, Z′₀], w, X₀) ⊢* ([q, X₁], ε, X₂X₃⋯XₙX₀) in M.

Thus L(M) = L(M′).

10.2 CLOSURE OF DCFL's UNDER COMPLEMENTATION
To show that the complement of a DCFL is also a DCFL, we would like to use the approach employed in Theorem 3.2 to show closure of the regular sets under complementation. That is, given a DPDA M, we would like to interchange final and nonfinal states and then be able to claim that the resulting DPDA accepts the complement of L(M). There are two difficulties that complicate this approach. The first is that the original DPDA might never move beyond some point on an input string, because on reading input w either it reaches an ID in which no move is possible, or it makes an infinity of moves on ε-input and never uses another input symbol. In either case, the DPDA does not accept any input with w as a prefix, and thus a DPDA accepting the complement should accept every string with prefix w. However, if we simply interchanged final and nonfinal states, the resulting DPDA still would not move beyond w and therefore would not accept strings with prefix w.

The second difficulty is due to the fact that after seeing a sentence x, the DPDA may make several moves on ε-input. The DPDA may be in final states after some of these moves and in nonfinal states after others. In this case, interchanging the final and nonfinal states results in the DPDA still accepting x.
Forcing DPDA's to scan their input

To remove the first difficulty, we prove a lemma stating that, given a DPDA M, we can always find an equivalent DPDA M′ that will never enter an ID from which it will not eventually use another input symbol.

Lemma 10.3 Let M be a DPDA. There exists an equivalent DPDA M′ such that on every input, M′ scans the entire input.
Proof We can assume without loss of generality that for every accessible ID and input symbol, M has a next move. Otherwise, one can add an endmarker on the stack to prevent M from erasing the stack entirely and thereby halting without scanning the entire input. In addition, one can add a "dead state" d, so that for any combination of state, input symbol, and stack symbol for which M has no next move, either using the input symbol or an ε-input, a transfer to state d occurs. On any input symbol, the only transition from state d is to state d, and no change of the stack occurs. Of course, d is not an accepting state.

Now, if for every ID and input symbol M has a next move, then the only way in which M might never reach the end of its input is if in some ID, M makes an infinity of moves on ε-input. If in state q with Z on top of the stack, M makes an infinity of ε-moves without erasing the symbol Z, then let M instead enter the dead state d. This change cannot affect the language accepted unless M entered an accepting state at some time during the infinite sequence of ε-moves. In that case, we introduce a new final state f, letting δ′(q, ε, Z) = (f, Z) and δ′(f, ε, Z) = (d, Z).

Formally, let M = (Q, Σ, Γ, δ, q₀, Z₀, F). Define M′ = (Q ∪ {q′₀, d, f}, Σ, Γ ∪ {X₀}, δ′, q′₀, X₀, F ∪ {f}), where:

1) δ′(q′₀, ε, X₀) = (q₀, Z₀X₀). X₀ marks the bottom of the stack.

2) If for some q in Q, a in Σ, and Z in Γ, δ(q, a, Z) and δ(q, ε, Z) are both empty, then δ′(q, a, Z) = (d, Z). Also, for all q in Q and a in Σ, δ′(q, a, X₀) = (d, X₀). Enter the dead state if no move is possible.

3) δ′(d, a, Z) = (d, Z) for all a in Σ and Z in Γ ∪ {X₀}.

4) If for q and Z and all i there exist qᵢ and γᵢ for which (q, ε, Z) ⊢ⁱ (qᵢ, ε, γᵢ), then δ′(q, ε, Z) = (d, Z) provided no qᵢ is final, and δ′(q, ε, Z) = (f, Z) if one or more of the qᵢ's is final. (Note we have not claimed that we can determine whether δ′(q, ε, Z) should be (d, Z) or (f, Z). However, there are only a finite number of such decisions to be made. For each possible set of choices there exists a DPDA, and one of these DPDA's will be the desired one. We shall subsequently show that the construction can be made effective.)

5) δ′(f, ε, Z) = (d, Z) for all Z in Γ ∪ {X₀}.

6) For any q in Q, a in Σ ∪ {ε}, and Z in Γ, if δ′(q, a, Z) has not been defined by rule (2) or (4), then δ′(q, a, Z) = δ(q, a, Z).

The argument preceding the formal construction should convince us that L(M′) = L(M). To prove that M′ uses all its input, suppose that for some proper prefix x of xy,

(q′₀, xy, X₀) ⊢* (q, y, Z₁Z₂⋯ZₖX₀),

and from ID (q, y, Z₁Z₂⋯ZₖX₀) no symbol of y is ever consumed. By rule (2) it is not possible that M′ halts. By rule (4), it is not possible that M′ makes an infinite sequence of ε-moves without erasing Z₁. Therefore M′ must eventually erase Z₁, then similarly Z₂, ..., Zₖ, and eventually enter an ID (q′, y, X₀). By rule (2),

(q′, y, X₀) ⊢ (d, y′, X₀),

where y = ay′ and a is in Σ. Thus a symbol of y is consumed after all, contradicting our assumption, and M′ satisfies the conditions of the lemma.

Let us now observe that the construction made in rule (4) can be made effective. Assume without loss of generality that M is in normal form. We shall compute more information than is actually needed. In particular, we determine for each q and p in Q and Z in Γ whether

1) (q, ε, Z) ⊢* (p, ε, Z),
2) (q, ε, Z) ⊢* (p, ε, ε),
3) (q, ε, Z) ⊢* (p, ε, γ) for some γ in Γ*.

For each q and Z we can determine from (3) whether M ever enters a state that consumes the next input symbol without erasing Z. (Note that by the construction of Lemma 10.2, the state p alone determines whether a non-ε-input move is to be made.) If not, then from (2) we can determine whether M erases Z. If neither event occurs, then either M′ must enter the dead state by rule (2), or rule (4) applies, and again (3) tells us whether δ′(q, ε, Z) is (d, Z) or (f, Z).

Construct Boolean-valued tables T₁, T₂, and T₃ such that for i = 1, 2, and 3, Tᵢ(q, Z, p) is true if and only if statement (i) is true for q, Z, and p. The tables are initially all false and are filled inductively. The basis is to set T₃(q, Z, p) = true if δ(q, ε, Z) = (p, YZ), to set T₁(q, Z, p) = T₃(q, Z, p) = true if δ(q, ε, Z) = (p, Z), and to set T₂(q, Z, p) = true if δ(q, ε, Z) = (p, ε). The inductive inferences are:

1) Whenever δ(q, ε, Z) = (r, YZ), then
a) if T₂(r, Y, s) and T₂(s, Z, p) are true, set T₂(q, Z, p) = true;
b) if T₂(r, Y, s) and T₁(s, Z, p) are true, set T₁(q, Z, p) = true;
c) if T₂(r, Y, s) and T₃(s, Z, p) are true, or T₁(r, Y, s) and T₃(s, Y, p) are true, set T₃(q, Z, p) = true;
d) if T₃(r, Y, p) is true, set T₃(q, Z, p) = true.

2) Whenever δ(q, ε, Z) = (r, Z), then
a) if T₁(r, Z, p) is true, set T₁(q, Z, p) = true;
b) if T₂(r, Z, p) is true, set T₂(q, Z, p) = true;
c) if T₃(r, Z, p) is true, set T₃(q, Z, p) = true.

We leave as an exercise an efficient algorithm for filling in the true entries in the tables, and a proof that the only true entries are the ones that follow from the basis and rules (1) and (2) above.
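The tables can be filled by a straightforward fixed-point iteration. In the sketch below, the encoding of ε-moves as ('pop',), ('keep',), or ('push', Y) is an assumption of the sketch; a fully efficient algorithm, as the text notes, is left as an exercise.

```python
# Sketch of the computation of T1, T2, T3 to a fixed point.
# eps_delta[(q, Z)] = (p, op) gives the e-moves of a normal-form DPDA,
# with op one of ('pop',), ('keep',), or ('push', Y).
def fill_tables(states, eps_delta):
    T1, T2, T3 = set(), set(), set()
    for (q, Z), (p, op) in eps_delta.items():        # basis
        if op[0] == 'keep':
            T1.add((q, Z, p)); T3.add((q, Z, p))
        elif op[0] == 'pop':
            T2.add((q, Z, p))
        else:                                        # ('push', Y)
            T3.add((q, Z, p))
    changed = True
    while changed:                                   # inductive inferences
        before = (len(T1), len(T2), len(T3))
        for (q, Z), (r, op) in eps_delta.items():
            if op[0] == 'push':
                Y = op[1]
                for s in states:
                    for p in states:
                        if (r, Y, s) in T2 and (s, Z, p) in T2: T2.add((q, Z, p))
                        if (r, Y, s) in T2 and (s, Z, p) in T1: T1.add((q, Z, p))
                        if (r, Y, s) in T2 and (s, Z, p) in T3: T3.add((q, Z, p))
                        if (r, Y, s) in T1 and (s, Y, p) in T3: T3.add((q, Z, p))
                for p in states:
                    if (r, Y, p) in T3: T3.add((q, Z, p))
            elif op[0] == 'keep':
                for p in states:
                    if (r, Z, p) in T1: T1.add((q, Z, p))
                    if (r, Z, p) in T2: T2.add((q, Z, p))
                    if (r, Z, p) in T3: T3.add((q, Z, p))
        changed = (len(T1), len(T2), len(T3)) != before
    return T1, T2, T3
```

Since each table has at most |Q|²|Γ| entries and entries are only ever added, the iteration terminates.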
Closure under complementation

We are now ready to prove that the DCFL's are closed under complementation. To do so we must deal with the second problem mentioned at the beginning of this section: the possibility that after reading input w, the DPDA makes a sequence of ε-moves, entering both final and nonfinal states. The solution is to modify the DPDA by adding a second component to the state. The second component records whether a final state of the original DPDA has been entered since the last time a true (non-ε) input was used in a move. If not, the DPDA accepting the complement enters a final state of its own, just before it is ready to use the next true input symbol.
Theorem 10.1 The complement of a DCFL is a DCFL.

Proof Let M = (Q, Σ, Γ, δ, q₀, Z₀, F) be a DPDA satisfying Lemma 10.3. Let M′ = (Q′, Σ, Γ, δ′, q′₀, Z₀, F′) be a DPDA simulating M, where

Q′ = {[q, k] | q in Q and k = 1, 2, or 3},
F′ = {[q, 3] | q in Q},

and q′₀ = [q₀, 1] if q₀ is in F, while q′₀ = [q₀, 2] if q₀ is not in F.

The purpose of k in [q, k] is to record, between true inputs, whether or not M has entered an accepting state since the last true input. If M has entered an accepting state since the last true input, then k = 1; if not, then k = 2. If k = 1 when M reads a true input symbol, then M′ simulates the move of M and changes k to 1 or 2, depending on whether the new state of M is or is not in F. If k = 2, M′ first changes k to 3 and then simulates the move of M, changing k to 1 or 2, depending on whether the new state of M is or is not in F.

Thus δ′ is defined as follows, for q and p in Q, and a in Σ.

1) If δ(q, ε, Z) = (p, γ), then for k = 1 or 2,

δ′([q, k], ε, Z) = ([p, k′], γ),

where k′ = 1 if k = 1 or p is in F; otherwise k′ = 2.

2) If δ(q, a, Z) = (p, γ) for a in Σ, then

δ′([q, 2], ε, Z) = ([q, 3], Z)

and

δ′([q, 1], a, Z) = δ′([q, 3], a, Z) = ([p, k], γ),

where k = 1 or 2 according as p is or is not in F.

We claim that L(M′) is the complement of L(M). Suppose that a₁a₂⋯aₙ is in L(M). Then M enters an accepting state after using aₙ as an input. In that case, the second component of the state of M′ will be 1 before it is possible for M′ to use a true input after aₙ. Therefore, M′ does not accept (enter a state whose second component is 3) while aₙ was the last true input used. If a₁a₂⋯aₙ is not in L(M), then by Lemma 10.3, M′ will at some time after reading aₙ have no ε-moves to make and will have to use a true input symbol. But at this time the second component of M′'s state is 2, since a₁a₂⋯aₙ is not in L(M). By rule (2), M′ will accept before attempting to use a true input symbol.
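The bookkeeping in rules (1) and (2) is simple enough to transcribe directly. As in the earlier sketches, the dictionary encoding with None for an ε-input is an assumption, and M is assumed to be a DPDA satisfying Lemma 10.3.

```python
# Sketch of the Theorem 10.1 construction.  delta maps (q, a, Z) to
# (p, gamma), with a = None denoting an e-move; states of M' are pairs
# (q, k), and F is the set of accepting states of M.
def complement_delta(delta, F):
    d = {}
    for (q, a, Z), (p, gamma) in delta.items():
        if a is None:                                   # rule (1)
            for k in (1, 2):
                k2 = 1 if (k == 1 or p in F) else 2
                d[((q, k), None, Z)] = ((p, k2), gamma)
        else:                                           # rule (2)
            d[((q, 2), None, Z)] = ((q, 3), (Z,))
            k2 = 1 if p in F else 2
            d[((q, 1), a, Z)] = ((p, k2), gamma)
            d[((q, 3), a, Z)] = ((p, k2), gamma)
    return d

def accepting(state):            # F' = {[q, 3]}
    return state[1] == 3
```

Determinism is preserved: the ε-move added by rule (2) for [q, 2] cannot conflict with rule (1), because in a DPDA a state with a true-input move on Z has no ε-move on Z.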
Before concluding this section we state the following corollary.

Corollary Every deterministic CFL is accepted by some DPDA that, in an accepting state, may make no move on ε-input.

Proof The statement is implicit in the proof of Theorem 10.1. Note that in a final state (one in which k = 3) no ε-move is possible.

It is possible to use Theorem 10.1 to show certain languages not to be DCFL's.

Example 10.1 The language L = {0ⁱ1ʲ2ᵏ | i = j or j = k} is a CFL generated by the grammar

S → AB | CD
A → 0A1 | ε
B → 2B | ε
C → 0C | ε
D → 1D2 | ε

However, L is not a DCFL. If it were, then its complement L̄ would be a DCFL and hence a CFL. By Theorem 6.5, L₁ = L̄ ∩ 0*1*2* would be a CFL. But L₁ = {0ⁱ1ʲ2ᵏ | i ≠ j and j ≠ k}. A proof using Ogden's lemma, similar to that of Example 6.3, shows that L₁ is not a CFL, so L is not a DCFL.
10.3 PREDICTING MACHINES

For a number of other closure properties of DCFL's we need a construction in which the stack symbols of a DPDA M are modified to contain information about a certain finite automaton A. The information associated with the top stack symbol tells, for each state q of M and p of A, whether there is some input string that causes M to accept when started in state q with its current stack, and simultaneously causes A to accept if started in state p.

Formally, let M = (Q_M, Σ, Γ, δ_M, q₀, Z₀, F_M) be a normal-form DPDA and A = (Q_A, Σ, δ_A, p₀, F_A) a finite automaton. Then π(M, A), the predicting machine for M and A, is defined by (Q_M, Σ, Γ × Δ, δ, q₀, X₀, F_M), where Δ is the set of subsets of Q_M × Q_A. The intention is that if π(M, A) is in ID (r, x, [Z, μ]γ), then μ consists of exactly those pairs (q, p) such that there is a w in Σ* for which δ_A(p, w) is in F_A and (q, w, Zβ) ⊢* (s, ε, α) in M for some s in F_M and α and β in Γ*, where β is the string of first components of γ.

To define δ and X₀ we need additional notation. Let M_{q,Z} be M with q and Z made the start state and start symbol, respectively. Let A_p be A with p made the start state. Then by our usual notation,

L(M_{q,Z}) = {w | (q, w, Z) ⊢* (s, ε, γ) for some s in F_M and γ in Γ*}

and

L(A_p) = {w | δ_A(p, w) is in F_A}.

Let N_r(M_{q,Z}) be the set of strings that cause M_{q,Z} to erase its stack and enter state r; that is,

N_r(M_{q,Z}) = {w | (q, w, Z) ⊢* (r, ε, ε)}.

Surely L(M_{q,Z}) is a DCFL and L(A_p) is a regular set. It is also true that N_r(M_{q,Z}) is a DCFL. In proof, modify M to place a marker Y₀ at the bottom of its stack and then simulate M in state q with stack ZY₀. If Y₀ becomes the top stack symbol, accept if the state is r and reject if not. Finally, let L_s(A_p) = {w | δ_A(p, w) = s}. Clearly L_s(A_p) is regular.

Now we may define δ(r, a, [Z, μ]) for r in Q_M, a in Σ ∪ {ε}, Z in Γ, and μ in Δ, as follows.

1) If δ_M(r, a, Z) = (s, ε), then δ(r, a, [Z, μ]) = (s, ε). Note that μ does not influence the action of π(M, A), except in rule (3) below, where it influences the second component of the stack symbol pushed.

2) If δ_M(r, a, Z) = (s, Z), then δ(r, a, [Z, μ]) = (s, [Z, μ]).

3) If δ_M(r, a, Z) = (s, YZ), then δ(r, a, [Z, μ]) = (s, [Y, ν][Z, μ]), where ν consists of those pairs (q, p) such that either
a) L(M_{q,Y}) ∩ L(A_p) is nonempty, or
b) there are some t in Q_M and u in Q_A such that N_t(M_{q,Y}) ∩ L_u(A_p) is nonempty and (t, u) is in μ.

Note that L(M_{q,Y}) and N_t(M_{q,Y}) are CFL's, and L(A_p) and L_u(A_p) are regular, so by Theorems 6.5 and 6.6 we may determine whether the languages mentioned in (a) and (b) are empty.

Finally, let X₀ = [Z₀, μ₀], where

μ₀ = {(q, p) | L(M_{q,Z₀}) ∩ L(A_p) ≠ ∅}.

Lemma 10.4 π(M, A) as defined above has the property that if

(q₀, x, [Z₀, μ₀]) ⊢* (r, y, [Z₁, μ₁][Z₂, μ₂]⋯[Zₙ, μₙ]),

then

a) (q₀, x, Z₀) ⊢* (r, y, Z₁Z₂⋯Zₙ), and

b) for 1 ≤ i ≤ n,

μᵢ = {(q, p) | for some w, (q, w, ZᵢZᵢ₊₁⋯Zₙ) ⊢* (s, ε, γ) for some s in F_M and γ in Γ*, and δ_A(p, w) is in F_A}.

Proof Part (a) is obvious, since π(M, A) simulates M, carrying along the second components of stack symbols but not allowing them to influence anything but other second components of stack symbols.

We prove (b) by induction on i, starting at i = n and working down. The basis, i = n, is easy. Zₙ must be Z₀, since M is in normal form. The definition of X₀ plus rule (2) in the definition of δ gives us the basis.

For the induction, suppose the result is true for i + 1. Then μᵢ was constructed from μᵢ₊₁ as ν is constructed from μ in rule (3). Suppose there is some w such that (q, w, ZᵢZᵢ₊₁⋯Zₙ) ⊢* (s, ε, γ) for s in F_M, and δ_A(p, w) is in F_A. Then there are two cases, depending on whether Zᵢ is ever erased. If not, then w is in L(M_{q,Zᵢ}) and also in L(A_p), so by rule (3a), (q, p) is in μᵢ. If Zᵢ is erased, let w = w₁w₂, where (q, w₁, Zᵢ) ⊢* (t, ε, ε), and let δ_A(p, w₁) = u. Then w₁ is in N_t(M_{q,Zᵢ}) and also in L_u(A_p). Also, (t, w₂, Zᵢ₊₁Zᵢ₊₂⋯Zₙ) ⊢* (s, ε, γ), and δ_A(u, w₂) is in F_A, so by the inductive hypothesis, (t, u) is in μᵢ₊₁. Thus by rule (3b), (q, p) is in μᵢ.

Conversely, if (q, p) is in μᵢ by rule (3a), then there is a w in L(M_{q,Zᵢ}) ∩ L(A_p), so (q, w, ZᵢZᵢ₊₁⋯Zₙ) ⊢* (s, ε, γ) for some s in F_M, and δ_A(p, w) is in F_A. If (q, p) is in μᵢ by rule (3b), then there exist w₁ in Σ*, t in Q_M, and u in Q_A such that (q, w₁, Zᵢ) ⊢* (t, ε, ε), δ_A(p, w₁) = u, and (t, u) is in μᵢ₊₁. By the inductive hypothesis, there exists w₂ in Σ* such that (t, w₂, Zᵢ₊₁Zᵢ₊₂⋯Zₙ) ⊢* (s, ε, γ) for some s in F_M, and δ_A(u, w₂) is in F_A. Then w = w₁w₂ satisfies (q, w, ZᵢZᵢ₊₁⋯Zₙ) ⊢* (s, ε, γ), and δ_A(p, w₁w₂) is in F_A, so (q, p) belongs in μᵢ. This completes the induction and the proof of the lemma.
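The emptiness tests needed in rules (3a) and (3b) rest on product constructions. As one concrete ingredient, here is a sketch of deciding whether the languages of two DFA's intersect, by search over the product automaton; the dictionary representation of the transition functions is an assumption of the sketch.

```python
# Sketch of an emptiness test of the kind used in rules (3a) and (3b):
# is L(A1) ∩ L(A2) nonempty?  Each DFA is a dict (state, symbol) -> state
# together with a start state and a set of final states.
from collections import deque

def product_nonempty(d1, s1, f1, d2, s2, f2, alphabet):
    seen = {(s1, s2)}
    queue = deque([(s1, s2)])
    while queue:
        p, q = queue.popleft()
        if p in f1 and q in f2:
            return True
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For instance, running the test on a parity-of-0's automaton against itself confirms that its even-0's and odd-0's languages are disjoint.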
Example 10.2 Let M = ({q0, q1, q2, q3}, {0, 1}, {X, Z0}, δ_M, q0, Z0, {q3}), where

δ_M(q0, 0, Z0) = (q0, XZ0),    δ_M(q0, 0, X) = (q0, XX),
δ_M(q0, 1, X) = (q1, XX),      δ_M(q1, 1, X) = (q1, XX),
δ_M(q1, 0, X) = (q2, ε),       δ_M(q2, 0, X) = (q2, ε),
δ_M(q2, ε, Z0) = (q3, ε).

Also let A = ({p0, p1}, {0, 1}, δ_A, p0, {p0}), where

δ_A(p0, 0) = p1,    δ_A(p0, 1) = p0,    δ_A(p1, 0) = p0,    δ_A(p1, 1) = p1.

Observe that

L(M) = L(M_{q0,Z0}) = {0^i 1^j 0^k | i + j = k, i > 0 and j > 0}.

Also L(M_{q1,Z0}) = ∅, and L(M_{q2,Z0}) = L(M_{q3,Z0}) = {ε}. L(A_{p0}) = (1 + 01*0)*, that is, strings with an even number of 0's, and L(A_{p1}) = 1*0(1 + 01*0)*, strings with an odd number of 0's. Thus L(M_{q0,Z0}) ∩ L(A_{p0}) contains strings such as 00110000, and L(M_{q0,Z0}) ∩ L(A_{p1}) contains strings such as 01110000. L(M_{q2,Z0}) ∩ L(A_{p0}) and L(M_{q3,Z0}) ∩ L(A_{p0}) each contain ε, but the other four intersections of the form L(M_{qi,Z0}) ∩ L(A_{pj}) are empty. Thus the start symbol of π(M, A) is [Z0, μ0], where

μ0 = {(q0, p0), (q0, p1), (q2, p0), (q3, p0)}.

Now let us compute δ(q0, 0, [Z0, μ0]) = (q0, [X, ν][Z0, μ0]). To do so we need to deduce ν. L(M_{qi,X}) = ∅ for i = 0, 1, or 2, since we cannot accept without a Z0 on the stack and cannot write Z0 if it wasn't there originally. Thus there is no contribution to ν from rule (3a). However, L(M_{q3,X}) ∩ L(A_{p0}) = {ε}, so we add (q3, p0) to ν.

Consider rule (3b).

N_{q2}(M_{q0,X}) = {0^i 1^j 0^k | i + j = k − 1 and j > 0},
N_{q2}(M_{q1,X}) = {1^j 0^k | j = k − 1},

and

N_{q2}(M_{q2,X}) = {0}.

The other sets of the form N_{qi}(M_{qj,X}) are empty. Also, L_{pj}(A_{pi}) is all strings with an even number of 0's if i = j, and all strings with an odd number of 0's if i ≠ j. Since N_{qi}(M_{qj,X}) is nonempty only if i = 2 and j = 0, 1, or 2, we can apply rule (3b) successfully only if the pair (q2, p0) is chosen from μ0. We see that N_{q2}(M_{q0,X}) ∩ L_{p0}(A_{p0}) and N_{q2}(M_{q0,X}) ∩ L_{p0}(A_{p1}) are both nonempty, yielding (q0, p0) and (q0, p1) for ν. Similarly, N_{q2}(M_{q1,X}) ∩ L_{p0}(A_{p0}) and N_{q2}(M_{q1,X}) ∩ L_{p0}(A_{p1}) are nonempty, yielding (q1, p0) and (q1, p1) for ν. Also, N_{q2}(M_{q2,X}) ∩ L_{p0}(A_{p1}) is nonempty, yielding (q2, p1), but N_{q2}(M_{q2,X}) ∩ L_{p0}(A_{p0}) is empty. Thus

ν = {(q0, p0), (q0, p1), (q1, p0), (q1, p1), (q2, p1), (q3, p0)}.
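The membership claims in Example 10.2 can be checked by direct simulation; the sketch below encodes M and A of the example (the encoding and helper names `run_M`, `run_A` are ours, not the book's).

```python
# Simulation of M and A from Example 10.2 (our encoding; not from the text).
DELTA_M = {
    ('q0', '0', 'Z0'): ('q0', ['X', 'Z0']),
    ('q0', '0', 'X'):  ('q0', ['X', 'X']),
    ('q0', '1', 'X'):  ('q1', ['X', 'X']),
    ('q1', '1', 'X'):  ('q1', ['X', 'X']),
    ('q1', '0', 'X'):  ('q2', []),
    ('q2', '0', 'X'):  ('q2', []),
}

def run_M(w):
    """Does M accept w by final state q3?  Stack top is index 0."""
    state, stack = 'q0', ['Z0']
    for c in w:
        if not stack or (state, c, stack[0]) not in DELTA_M:
            return False
        state, push = DELTA_M[(state, c, stack[0])]
        stack = push + stack[1:]
    # the one epsilon-move: delta_M(q2, eps, Z0) = (q3, eps)
    if state == 'q2' and stack == ['Z0']:
        state = 'q3'
    return state == 'q3'

def run_A(w):
    """Does A accept w?  A tracks the parity of 0's; p0 is accepting."""
    table = {'p0': {'0': 'p1', '1': 'p0'}, 'p1': {'0': 'p0', '1': 'p1'}}
    state = 'p0'
    for c in w:
        state = table[state][c]
    return state == 'p0'
```

For instance, 00110000 is accepted by both machines, while 01110000 is accepted by M but has an odd number of 0's, so it lies in L(M_{q0,Z0}) ∩ L(A_{p1}).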
10.4 ADDITIONAL CLOSURE PROPERTIES OF DCFL's

Using the idea developed in the previous section we can prove a few closure properties of deterministic context-free languages. Before proceeding, we present one more technical lemma. The lemma asserts that we can define acceptance for a DPDA by a combination of state and the top stack symbol; the language so defined is still a deterministic language.

Lemma 10.5 Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a DPDA. Let B be any subset of Q × Γ, that is, pairs of state and stack symbol. Define

L = {w | (q0, w, Z0) ⊢* (q, ε, Zγ) for some (q, Z) in B}.

Then L is a DCFL.

Proof We define a DPDA M' accepting L, as follows. Let M' = (Q', Σ, Γ, δ', q0, Z0, F'), where Q' = {q, q', q'' | q in Q} and F' = {q'' | q in Q}. M' makes the same moves as M, except that M' moves from an unprimed state to a singly primed state and then, on ε-input, moves back to the corresponding unprimed state, either directly or through a doubly primed version. The latter case applies only if the pair of state and top symbol of the stack is in B. Formally,

1) if δ(q, a, Z) = (p, γ), then δ'(q, a, Z) = (p', γ);
2) δ'(q', ε, Z) = (q, Z) provided (q, Z) is not in B;
3) δ'(q', ε, Z) = (q'', Z) and δ'(q'', ε, Z) = (q, Z) if (q, Z) is in B.
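As a concrete instance of the lemma, the sketch below (our illustration, not the book's construction) takes the DPDA M of Example 10.2 with B = {(q2, X)} and tests membership in the L of Lemma 10.5 by inspecting the state and top stack symbol in each configuration reachable once the input is consumed.

```python
# Membership in L of Lemma 10.5 for M of Example 10.2 (our encoding).
DELTA_M = {
    ('q0', '0', 'Z0'): ('q0', ['X', 'Z0']),
    ('q0', '0', 'X'):  ('q0', ['X', 'X']),
    ('q0', '1', 'X'):  ('q1', ['X', 'X']),
    ('q1', '1', 'X'):  ('q1', ['X', 'X']),
    ('q1', '0', 'X'):  ('q2', []),
    ('q2', '0', 'X'):  ('q2', []),
}

def accepts_by_pair(w, B):
    """Is (q0, w, Z0) |-* (q, eps, Z gamma) for some (q, Z) in B?"""
    state, stack = 'q0', ['Z0']
    for c in w:
        if not stack or (state, c, stack[0]) not in DELTA_M:
            return False
        state, push = DELTA_M[(state, c, stack[0])]
        stack = push + stack[1:]
    configs = [(state, stack)]
    if state == 'q2' and stack[:1] == ['Z0']:   # the epsilon-move to q3
        configs.append(('q3', stack[1:]))
    return any(st and (s, st[0]) in B for s, st in configs)

B = {('q2', 'X')}
```

With this B, the language defined is {0^i 1^j 0^k | i, j ≥ 1 and 1 ≤ k < i + j}, and Lemma 10.5 guarantees it is a DCFL.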
Quotient with a regular set

Recall that the quotient of L1 with respect to L2, denoted L1/L2, is

{x | there exists w in L2 such that xw is in L1}.

In Exercise 6.4 we claimed that the CFL's were closed under quotient with a regular set. (See Theorem 11.3 for a proof.) We shall now prove a similar result for DCFL's.

Theorem 10.2 Let L be a DCFL and R a regular set. Then L/R is a DCFL.

Proof Let L = L(M) for a DPDA M that always scans its entire input, and let R = L(A) for finite automaton A. Suppose M = (Q_M, Σ, Γ, δ_M, q0, Z0, F_M) and A = (Q_A, Σ, δ_A, p0, F_A). Then let

M' = (Q_M, Σ, Γ', δ, q0, [Z0, μ0], F_M)

be π(M, A), the predicting machine for M and A, whose stack symbols are the pairs [Z, μ]. Let B be the subset of Q_M × Γ' containing all (q, [Z, μ]) such that (q, p0) is in μ. Then by Lemma 10.5,

L1 = {x | (q0, x, [Z0, μ0]) ⊢* (q, ε, [Z, μ]γ') and (q, p0) is in μ}

is a DCFL. By Lemma 10.4, (q, p0) is in μ if and only if there is some w such that (q, w, Zγ) ⊢* (s, ε, β), where s is in F_M, Zγ is the string of first components of [Z, μ]γ', and δ_A(p0, w) is in F_A. Equivalently,

L1 = {x | for some w in Σ*, xw is in L(M) and w is in L(A)}.

That is, L1 = L/R. Thus L/R is a DCFL.
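The definition of quotient can be explored by brute force on length-bounded samples; the helper below is our illustration (not the book's construction), shown for L1 = {0^n 1^n | n ≥ 1} and L2 = 1*.

```python
from itertools import product

def strings(alphabet, max_len):
    """All strings over alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for tup in product(alphabet, repeat=n):
            yield ''.join(tup)

def quotient(in_l1, in_l2, alphabet, max_len):
    """{x | exists w in L2 with xw in L1}, with x and w length-bounded."""
    l2_sample = [w for w in strings(alphabet, max_len) if in_l2(w)]
    return {x for x in strings(alphabet, max_len)
            if any(in_l1(x + w) for w in l2_sample)}

in_l1 = lambda w: len(w) >= 2 and w == '0' * (len(w)//2) + '1' * (len(w)//2)
in_l2 = lambda w: set(w) <= {'1'}          # w in 1*

q = quotient(in_l1, in_l2, '01', 4)
```

Here L1/L2 is {0^n 1^m | n ≥ 1 and m ≤ n}, so for example 0 and 0011 are in the quotient but 011 is not.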
MIN and MAX

We now show two operations that preserve DCFL's but not arbitrary CFL's. Recall that for each language L:

MIN(L) = {x | x is in L and no w in L is a proper prefix of x},

and

MAX(L) = {x | x is in L and x is not a proper prefix of any word in L}.
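For finite sets the two operations are one-line prefix checks; the sketch below (our helpers `min_op` and `max_op`, applied to a finite slice of the language of Example 10.3) makes the definitions concrete.

```python
def min_op(L):
    """MIN of a finite set L: members with no proper prefix in L."""
    return {x for x in L if not any(w != x and x.startswith(w) for w in L)}

def max_op(L):
    """MAX of a finite set L: members that are proper prefixes of no member."""
    return {x for x in L if not any(w != x and w.startswith(x) for w in L)}

# Finite slice of L = {0^i 1^j 0^k | i, j > 0 and 0 < k <= i + j}
L = {'0'*i + '1'*j + '0'*k
     for i in range(1, 4) for j in range(1, 4) for k in range(1, i + j + 1)}
```

On this slice, MIN keeps exactly the words with one trailing 0 and MAX those with i + j trailing 0's, matching Example 10.3.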
Example 10.3 Let

L = {0^i 1^j 0^k | i, j > 0 and 0 < k ≤ i + j}.

Then MIN(L) = 00*11*0, and MAX(L) = {0^i 1^j 0^{i+j} | i, j > 0}.

Theorem 10.3 If L is a DCFL, then MIN(L) and MAX(L) are DCFL's.

Proof Let M = (Q_M, Σ, Γ, δ_M, q0, Z0, F_M) be a DPDA that accepts L and always scans its entire input. Modify M to make no move in a final state, and call the resulting DPDA M1. Then M1 accepts MIN(L). In proof, if w is in MIN(L), then let

I0 ⊢ I1 ⊢ ··· ⊢ Im    (10.1)

be the sequence of ID's entered by M in an accepting computation of w. Since no proper prefix of w is in L(M), none of I0, I1, ..., I_{m−1} is an accepting ID, so (10.1) is also a computation of M1, and w is in L(M1). Conversely, if M1 accepts w by the computation (10.1), then (10.1) is also a computation of M, and none of I0, ..., I_{m−1} is accepting, so no proper prefix of w is in L. Thus w is in MIN(L).

For MAX we must use the predicting machine. Let A = (Q_A, Σ, δ_A, p0, F_A) be the simple FA of Fig. 10.1 accepting Σ+, and let M' = (Q_M, Σ, Γ', δ, q0, [Z0, μ0], F_M) be π(M, A), the predicting machine. Then by Lemma 10.5,

L1 = {x | (q0, x, [Z0, μ0]) ⊢* (q, ε, [Z, μ]γ), where q is in F_M and (q, p0) is not in μ}

is a DCFL. But since A accepts exactly the nonempty strings, (q, p0) is in μ if and only if some nonempty w extends the input read so far to a string of L. Thus L1 = MAX(L), so MAX(L) is a DCFL.

Fig. 10.1 The finite automaton A accepting Σ+.

Example 10.4 Let us use Theorem 10.3 to show a CFL not to be a DCFL. Let

L1 = {0^i 1^j 2^k | k ≤ i or k ≤ j}.

Then L1 is a CFL generated by the grammar

S → AB | C
A → 0A | ε
B → 1B2 | 1B | ε
C → 0C2 | 0C | D
D → 1D | ε

Suppose L1 were a DCFL. Then L2 = MAX(L1) would be a DCFL and hence a CFL. But L2 = {0^i 1^j 2^k | k = max(i, j)}. Suppose L2 were a CFL. Let n be the pumping lemma constant and consider z = uvwxy = 0^n 1^n 2^n. If neither v nor x has a 2, then z' = uv²wx²y has n 2's and at least (n + 1) 0's or at least (n + 1) 1's. Thus z' would not be in L2, as supposed.

Now consider the case where vx has a 2. If either v or x contains two distinct symbols, then z' = uv²wx²y is not of the form 0^i 1^j 2^k and so is not in L2. Thus either 0's or 1's are not present in vx. Hence uwy has fewer than n 2's but still has n 0's or n 1's, and so is not in L2. We conclude that L2 is not a CFL, so L1 is not a DCFL.
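The claim L2 = MAX(L1) can be sanity-checked on a length-bounded slice; this is our brute-force illustration (the `max_op` helper is not from the text), and the bound is chosen so that every relevant extension stays inside the sample.

```python
def max_op(L):
    """MAX of a finite set: words that are proper prefixes of no member."""
    return {x for x in L if not any(w != x and w.startswith(x) for w in L)}

# Bounded slice of L1 = {0^i 1^j 2^k | k <= i or k <= j}
L1 = {'0'*i + '1'*j + '2'*k
      for i in range(4) for j in range(4) for k in range(7)
      if k <= i or k <= j}
L2 = max_op(L1)
```

Within the slice, MAX keeps exactly the words with k = max(i, j) (the empty word is excluded, being a prefix of everything).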
Other closure properties

As a general rule, only those closure properties of CFL's mentioned in Section 6.2 that were given proofs using the PDA characterization carry over to DCFL's. In particular, we can state the following.

Theorem 10.4 The DCFL's are closed under (a) inverse homomorphism, and (b) intersection with a regular set.

Proof The arguments used in Theorems 6.3 and 6.5 work for DPDA's.

Theorem 10.5 The DCFL's are not closed under (a) homomorphism, (b) union, (c) concatenation, or (d) Kleene closure.

Proof See Exercise 10.4 and its solution.
10.5 DECISION PROPERTIES OF DCFL's

A number of problems that are undecidable for CFL's are decidable for DCFL's.

Theorem 10.6 Let L be a DCFL and R a regular set. The following problems are decidable.

1) Is L = R?
2) Is R ⊆ L?
3) Is L̄ = ∅?
4) Is L̄ a CFL?
5) Is L regular?
Proof

1) L = R if and only if L1 = (L ∩ R̄) ∪ (L̄ ∩ R) is empty. Since the DCFL's are effectively closed under complementation and intersection with a regular set, and since the CFL's are effectively closed under union, L1 is a CFL, and emptiness for CFL's is decidable.

2) R ⊆ L if and only if L̄ ∩ R = ∅. Since L̄ ∩ R is a CFL, L̄ ∩ R = ∅ is decidable.

3) Since the DCFL's are effectively closed under complementation, L̄ is a DCFL, and hence L̄ = ∅ is decidable.

4) The property "L̄ is a CFL" is trivial for DCFL's and hence decidable.

5) Regularity for DCFL's is decidable. The proof is lengthy, and the reader is referred to Stearns [1967] or Valiant [1975b].
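Parts (1)-(3) of the proof bottom out in the decidable emptiness test for CFL's, which marks the symbols that generate terminal strings; a sketch (the helper and the lowercase-terminal convention are ours):

```python
def cfl_empty(productions, start):
    """Decide whether L(G) is empty by marking generating symbols.
    productions: (head, body) pairs, body a tuple of symbols.
    Convention (ours): lowercase strings are terminals."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in generating and all(
                    s in generating or s.islower() for s in body):
                generating.add(head)
                changed = True
    return start not in generating
```

The loop runs to a fixed point, so the procedure always terminates, which is exactly what the decidability claims require.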
Undecidable properties of DCFL's

Certain other properties undecidable for CFL's remain so even when restricted to DCFL's. Many of these problems can be proved undecidable by observing that the languages L1 and L2 of Section 8.6, whose intersection is the valid computations of a Turing machine M, are DCFL's.

Theorem 10.7 Let L and L' be arbitrary DCFL's. Then the following problems are undecidable.

1) Is L ∩ L' = ∅?
2) Is L ⊆ L'?
3) Is L ∩ L' a DCFL?
4) Is L ∩ L' a CFL?
5) Is L ∪ L' a DCFL?

Proof Given an arbitrary TM M, we showed in Lemma 8.6 how to construct L1 and L2 such that L1 ∩ L2 = ∅ if and only if L(M) = ∅. It is easy to show that L1 and L2 are DCFL's by exhibiting DPDA's that accept them. Thus (1) follows immediately from the fact that it is undecidable whether L(M) = ∅. Since DCFL's are closed under complement, and L ⊆ L' if and only if L ∩ L̄' = ∅, (2) follows from (1).

To prove (3), (4), and (5), modify each TM to make at least two moves before accepting, as in Lemma 8.8. Then L1 ∩ L2 is either a finite set (in which case it is surely a CFL and a DCFL) or is not a CFL, depending on whether L(M) is finite. Thus decidability of (3) or (4) would imply decidability of finiteness for L(M), a known undecidable property. Since DCFL's are closed under complementation, deciding whether L ∪ L' is a DCFL is equivalent to deciding whether L̄ ∩ L̄' is a DCFL. Thus (5) follows from (3).
Theorem 10.8 Let L be an arbitrary CFL. It is undecidable whether L is regular and whether L is a DCFL.

Proof Let L be the CFL of invalid computations of an arbitrary TM M that makes at least two moves on every input. Then L is regular, and hence a DCFL, if and only if M accepts a finite set.

Finally, we observe that the question of whether two DCFL's are equivalent is an important unresolved problem of language theory.
10.6 LR(0) GRAMMARS

Recall that one motivation for studying DCFL's is their ability to describe the syntax of programming languages. Various compiler writing systems require syntactic specification in the form of restricted CFG's, which allow only the representation of DCFL's. Moreover, the parser produced by such compiler writing systems is essentially a DPDA.

In this section we introduce a restricted type of CFG called an LR(0) grammar. This class of grammars is the first in a family collectively called LR-grammars. Incidentally, LR(0) stands for "left-to-right scan of the input producing a rightmost derivation and using 0 symbols of lookahead on the input."

The LR(0) grammars define exactly the DCFL's having the prefix property. (L is said to have the prefix property if, whenever w is in L, no proper prefix of w is in L.) Note that the prefix property is not a severe restriction, since the introduction of an endmarker converts any DCFL to a DCFL with the prefix property. Thus L$ = {w$ | w is in L} is a DCFL with the prefix property whenever L is a DCFL. While the LR(0) restriction is too severe to provide convenient and natural grammars for many programming languages, the LR(0) condition captures the flavor of its more useful generalizations, which we discuss in Section 10.8, and which have been successfully used in several parser-generating systems.

LR-items

To introduce the LR(0) grammars we need some preliminary definitions. First, an item for a given CFG is a production with a dot anywhere in the right side, including the beginning or end. In the case of an ε-production, B → ε, B → · is an item.
Example 10.5 We now introduce a grammar that we shall use in a series of examples.

S' → Sc
S → SA | A        (10.2)
A → aSb | ab

This grammar, with start symbol S', generates strings of "balanced parentheses," treating a and b as left and right parentheses, respectively, and c as an endmarker. The items for grammar (10.2) are

S' → ·Sc      S → ·SA      A → ·aSb      A → ·ab
S' → S·c      S → S·A      A → a·Sb      A → a·b
S' → Sc·      S → SA·      A → aS·b      A → ab·
              S → ·A       A → aSb·
              S → A·
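Enumerating items is mechanical: one item per dot position in each right side. A sketch, in an encoding of our choosing (`S1` stands for S'):

```python
def items(grammar):
    """All LR(0) items (head, body, dot); the dot is a position index."""
    return {(head, body, dot)
            for head, body in grammar
            for dot in range(len(body) + 1)}

# Grammar (10.2); S1 stands for S'
G = [('S1', ('S', 'c')), ('S', ('S', 'A')), ('S', ('A',)),
     ('A', ('a', 'S', 'b')), ('A', ('a', 'b'))]
```

For grammar (10.2) this yields the fifteen items tabulated above.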
In what follows, we use the symbols ⇒rm and ⇒*rm to denote a single step in a rightmost derivation and a sequence of rightmost derivation steps, respectively. A right-sentential form is a sentential form that can be derived by a rightmost derivation. A handle of a right-sentential form γ for a CFG G is a substring β such that

S ⇒*rm δAw ⇒rm δβw

and δβw = γ. That is, a handle is a substring that could be introduced at the last step in a rightmost derivation of γ. Note that in this context, the position of β within γ is important. A viable prefix of a right-sentential form γ is any prefix of γ ending no farther right than the right end of a handle of γ.

Example 10.6 In grammar (10.2) there is a rightmost derivation

S' ⇒rm Sc ⇒rm SAc ⇒rm SaSbc.

Thus SaSbc is a right-sentential form, and its handle is aSb. Note that in any unambiguous grammar with no useless symbols, such as grammar (10.2), the rightmost derivation of a given right-sentential form is unique, so its handle is unique. Thus we may speak of "the handle" rather than "a handle." The viable prefixes of SaSbc are ε, S, Sa, SaS, and SaSb.

We say an item A → α·β is valid for a viable prefix γ if there is a rightmost derivation

S ⇒*rm δAw ⇒rm δαβw

and δα = γ.
Knowing which items are valid for a given viable prefix helps us find a rightmost derivation in reverse, as follows. An item is said to be complete if the dot is the rightmost symbol in the item. If A → α· is a complete item valid for γ, then it appears that A → α could have been used at the last step in the derivation of the right-sentential form γw, and that the previous right-sentential form was δAw, where γ = δα. Of course, we cannot more than suspect this, since A → α· may be valid for γ because of a rightmost derivation S ⇒*rm δ'Aw' ⇒rm γw'. Clearly, there could be two or more complete items valid for γ, or there could be a handle of γw that includes symbols of w. Intuitively, a grammar is defined to be LR(0) if in each such situation δAw is indeed the previous right-sentential form for γw. In that case, we can start with a string of terminals x that is in L(G), and hence is a right-sentential form of G, and work backward to previous right-sentential forms until we get to S. We then have a rightmost derivation of x.
Example 10.7 Consider grammar (10.2) and the right-sentential form abc. Since S' ⇒*rm Ac ⇒rm abc, we see that A → a·b is valid for viable prefix a, and A → ab· is valid for viable prefix ab. We also see that A → ·ab is valid for viable prefix ε. As A → ab· is a complete item, we might be able to deduce that Ac was the previous right-sentential form for abc.
Computing sets of valid items

The definition of LR(0) grammars and the method of accepting L(G) for LR(0) grammar G by a DPDA each depend on knowing the set of valid items for each viable prefix γ. It turns out that for every CFG G whatsoever, the set of viable prefixes is a regular set, and this regular set is accepted by an NFA whose states are the items for G. Applying the subset construction to this NFA yields a DFA whose state in response to a viable prefix γ is the set of valid items for γ.

The NFA M recognizing the viable prefixes for CFG G = (V, T, P, S) is defined as follows. Let M = (Q, V ∪ T, δ, q0, Q), where Q is the set of items for G plus the state q0, which is not an item. Define

1) δ(q0, ε) = {S → ·α | S → α is a production},
2) δ(A → α·Bβ, ε) = {B → ·γ | B → γ is a production},
3) δ(A → α·Xβ, X) = {A → αX·β}.

Rule (2) allows expansion of a variable B appearing immediately to the right of the dot. Rule (3) permits moving the dot over any grammar symbol X if X is the next input symbol.

Example 10.8 The NFA recognizing viable prefixes for grammar (10.2) is shown in Fig. 10.2.

Fig. 10.2 NFA recognizing viable prefixes for grammar (10.2).
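Theorem 10.9 below justifies computing valid items via the subset construction. The sketch here (our encoding, reusing the tuple representation of items) closes item sets under rule (2) and moves the dot per rule (3) for grammar (10.2).

```python
G = [('S1', ('S', 'c')), ('S', ('S', 'A')), ('S', ('A',)),
     ('A', ('a', 'S', 'b')), ('A', ('a', 'b'))]   # S1 stands for S'
VARS = {head for head, _ in G}

def closure(items):
    """Close under rule (2): add B -> .gamma when the dot precedes B."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, body, dot in list(items):
            if dot < len(body) and body[dot] in VARS:
                for h, b in G:
                    if h == body[dot] and (h, b, 0) not in items:
                        items.add((h, b, 0))
                        changed = True
    return items

def goto(items, X):
    """Rule (3): move the dot over grammar symbol X, then close."""
    return closure({(h, b, d + 1) for h, b, d in items
                    if d < len(b) and b[d] == X})

def valid_items(gamma):
    """Subset-construction state reached on gamma = its valid items."""
    state = closure({(h, b, 0) for h, b in G if h == 'S1'})
    for X in gamma:
        state = goto(state, X)
    return state
```

For instance, the only item valid for the viable prefix ab is A → ab·, in agreement with Example 10.7.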
Theorem 10.9 The NFA M defined above has the property that δ(q0, γ) contains A → α·β if and only if A → α·β is valid for γ.

Proof

Only if: We must show that each item A → α·β contained in δ(q0, γ) is valid for γ. We proceed by induction on the length of the shortest path labeled γ from q0 to A → α·β in the transition diagram for M. The basis (length 1) is straightforward. The only paths of length one from q0 are labeled ε and go to items of the form S → ·α. Each of these items is valid for ε because of the rightmost derivation S ⇒rm α.

For the induction, suppose that the result is true for paths shorter than k, and let there be a path of length k labeled γ from q0 to A → α·β. There are two cases, depending on whether the last edge is labeled ε or not.

Case 1 The last edge is labeled X, for X in V ∪ T. Then γ = γ'X, where α = α'X. The edge must come from a state A → α'·Xβ. By the inductive hypothesis, A → α'·Xβ is valid for γ'. Thus there is a rightmost derivation

S ⇒*rm δAw ⇒rm δα'Xβw,

where δα' = γ'. This same derivation shows that A → α'X·β (which is A → α·β) is valid for γ'X, that is, for γ.

Case 2 The last edge is labeled ε. In this case α must be ε, and A → α·β is really A → ·β. The item in the previous state is of the form B → α1·Aβ1, and it is valid for γ. Thus there is a derivation

S ⇒*rm δBw ⇒rm δα1 Aβ1 w,

where δα1 = γ. Let β1 ⇒*rm x for some terminal string x. Then the derivation can be continued as

S ⇒*rm δα1 Aβ1 w ⇒*rm δα1 Axw,

which shows that A → ·β is valid for δα1 = γ.

If: Suppose A → α·β is valid for γ. Then

S ⇒*rm γ1 Aw ⇒rm γ1 αβw,    (10.3)

where γ1 α = γ. If we can show that δ(q0, γ1) contains A → ·αβ, then by rule (3) we know δ(q0, γ) contains A → α·β. We therefore prove, by induction on the length of derivation (10.3), that δ(q0, γ1) contains A → ·αβ. The basis, one step, follows from rule (1). For the induction, consider the step in which the explicitly shown A was introduced. That is, write (10.3) as

S ⇒*rm γ2 Bx ⇒rm γ2 γ3 Aγ4 x ⇒*rm γ2 γ3 Ayx,

where γ4 ⇒*rm y, γ2 γ3 = γ1, and yx = w. Then by the inductive hypothesis applied to the derivation

S ⇒*rm γ2 Bx ⇒rm γ2 γ3 Aγ4 x,

we know that B → ·γ3 Aγ4 is in δ(q0, γ2). By rule (3), B → γ3·Aγ4 is in δ(q0, γ2 γ3). By rule (2), A → ·αβ is in δ(q0, γ2 γ3). Since γ2 γ3 = γ1, we have proved the inductive hypothesis.
Definition of LR(0) grammar

We are now prepared to define an LR(0) grammar. We say that G is an LR(0) grammar if

1) its start symbol does not appear on the right side of any production, and
2) for every viable prefix γ of G, whenever A → α· is a complete item valid for γ, then no other complete item nor any item with a terminal to the right of the dot is valid for γ.†

There is no prohibition against several incomplete items being valid for γ, as long as no complete item is valid. Theorem 10.9 gives a method for computing the sets of valid items for any viable prefix: just convert the NFA whose states are items to a DFA. In the DFA, the path from the start state labeled γ leads to the state that is the set of valid items for γ. Thus construct the DFA and inspect each state to see if a violation of the LR(0) condition occurs.

Example 10.9 The DFA constructed from the NFA of Fig. 10.2, with the dead state (empty set of items) and transitions to the dead state removed, is shown in Fig. 10.3. Of these states, all but I0, I1, I3, and I6 consist of a single complete item. The states with more than one item have no complete items, and surely S', the start symbol, does not appear on the right side of any production. Hence grammar (10.2) is LR(0).
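The inspection in Example 10.9 can be automated. This sketch (our encoding) builds the item-set DFA by closure and goto, and reports a violation whenever a state holds a complete item together with another complete item or with an item whose dot precedes a terminal.

```python
def is_lr0(grammar, start):
    """Check the LR(0) conditions by exploring the item-set DFA."""
    variables = {h for h, _ in grammar}
    symbols = variables | {s for _, b in grammar for s in b}
    # condition (1): the start symbol appears on no right side
    if any(start in b for _, b in grammar):
        return False

    def closure(items):
        items = set(items)
        changed = True
        while changed:
            changed = False
            for h, b, d in list(items):
                if d < len(b) and b[d] in variables:
                    for h2, b2 in grammar:
                        if h2 == b[d] and (h2, b2, 0) not in items:
                            items.add((h2, b2, 0))
                            changed = True
        return frozenset(items)

    start_state = closure({(h, b, 0) for h, b in grammar if h == start})
    states, stack = {start_state}, [start_state]
    while stack:
        I = stack.pop()
        complete = [it for it in I if it[2] == len(it[1])]
        term_shift = [it for it in I
                      if it[2] < len(it[1]) and it[1][it[2]] not in variables]
        if complete and (len(complete) > 1 or term_shift):
            return False          # condition (2) violated in this state
        for X in symbols:
            J = closure({(h, b, d + 1) for h, b, d in I
                         if d < len(b) and b[d] == X})
            if J and J not in states:
                states.add(J)
                stack.append(J)
    return True

G = [('S1', ('S', 'c')), ('S', ('S', 'A')), ('S', ('A',)),
     ('A', ('a', 'S', 'b')), ('A', ('a', 'b'))]
```

Grammar (10.2) passes, while a grammar with productions S → a and S → ab fails: after reading a, the state holds the complete item S → a· next to the shift item S → a·b.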
10.7 LR(0) GRAMMARS AND DPDA's

We now show that every LR(0) grammar generates a DCFL, and every DCFL with the prefix property has an LR(0) grammar. Since every language with an LR(0) grammar will be shown to have the prefix property, we have an exact characterization of the DCFL's; namely, L is a DCFL if and only if L$ has an LR(0) grammar.

† The only items that could be valid simultaneously with A → α· are items with a variable to the right of the dot, and this can occur only if α = ε; otherwise another violation of the LR(0) conditions can be shown to occur.

Fig. 10.3 DFA whose states are the sets of valid items.
DPDA's from LR(0) grammars

The way in which we construct a DPDA from an LR(0) grammar differs from the way in which we constructed a (nondeterministic) PDA from an arbitrary CFG in Theorem 5.3. In the latter theorem we traced out a leftmost derivation of the word on the PDA's input, using the stack to hold the suffix of a left-sentential form beginning at the leftmost variable. Now we shall trace out a rightmost derivation in reverse, using the stack to hold a viable prefix of a right-sentential form, including all the variables of that right-sentential form, and allowing the remainder of the form to appear on the input.

In order to describe this process clearly, it is useful to develop a new notation for ID's of a PDA. We picture the stack with its top at the right end, rather than the left. To distinguish the new notation from the old, we use brackets rather than parentheses: [q, α, w] is our synonym for (q, w, α^R).
To simulate rightmost derivations in an LR(0) grammar, not only do we keep a viable prefix on the stack, but above every symbol we keep a state of the DFA recognizing viable prefixes. If viable prefix X1 X2 ··· Xk is on the stack, then the complete stack contents will be

s0 X1 s1 X2 s2 ··· Xk sk,

where s_i = δ(q0, X1 X2 ··· X_i), and δ is the transition function of the DFA. The top state sk provides the valid items for X1 X2 ··· Xk.

If sk contains a complete item A → α·, say with α = X_{i+1} ··· Xk (note α may be ε, in which case i = k), then α is a suffix of X1 ··· Xk. Moreover, for some w, X1 ··· Xk w is a right-sentential form, and there is a derivation

S ⇒*rm X1 ··· Xi Aw ⇒rm X1 ··· Xk w.

Thus to obtain the right-sentential form previous to X1 ··· Xk w in a rightmost derivation, we reduce α to A, replacing X_{i+1} ··· Xk on top of the stack by A. That is, by a sequence of pop moves (using distinct states so the DPDA can remember what it is doing), followed by a move that pushes A and the correct covering state onto the stack, our DPDA will enter a sequence of ID's

[q, s0 X1 s1 ··· Xk sk, w] ⊢* [q, s0 X1 s1 ··· Xi si A s, w],    (10.4)

where s = δ(si, A). Note that if the grammar is LR(0), sk contains only A → α·, unless α = ε, in which case sk may contain some incomplete items. However, by the LR(0) definition, none of these items has a terminal to the right of the dot. Thus for any y such that X1 ··· Xk y is a right-sentential form, X1 ··· Xi Ay must be the previous right-sentential form, so reduction of α to A is correct regardless of the current input.

Now consider the case where sk contains only incomplete items. Then the right-sentential form previous to X1 ··· Xk w could not be formed by reducing a suffix of X1 ··· Xk to some variable, else there would be a complete item valid for X1 ··· Xk. There must be a handle ending to the right of Xk in X1 ··· Xk w, as X1 ··· Xk is a viable prefix. Thus the only appropriate action for the DPDA is to shift the next input symbol onto the stack. That is,

[q, s0 X1 s1 ··· Xk sk, ay] ⊢ [q, s0 X1 s1 ··· Xk sk a t, y],    (10.5)

where t = δ(sk, a). If t is not the empty set of items, X1 ··· Xk a is a viable prefix. If t is empty, we shall prove there is no possible previous right-sentential form for X1 ··· Xk ay, so the original input is not in the grammar's language, and the DPDA "dies" instead of making the move (10.5). We summarize the above observations in the next theorem.
Theorem 10.10 If L is L(G) for an LR(0) grammar G, then L is N(M) for a DPDA M.

Proof Construct from G the DFA D, with transition function δ, that recognizes G's viable prefixes. Let the stack symbols of M be the grammar symbols of G and the states of D. M has state q, which is its start state, along with the additional states needed to perform reductions by sequences of moves such as (10.4) above. We assume the reader can specify the set of states and the ε-transitions needed to effect a reduction. We also leave to the reader the specification of the transition function of M needed to implement the moves indicated by (10.4) and (10.5).

We have previously indicated why, if G is LR(0), reductions are the only possible way to get the previous right-sentential form when the state of the DFA on the top of M's stack contains a complete item. We claim that when M starts with w in L(G) on its input and only s0 on its stack, it will construct a rightmost derivation for w in reverse order. The only point still requiring proof is that when a shift is called for, as in (10.5), because the top DFA state on M's stack has only incomplete items, then there could not be a handle among the grammar symbols X1 ··· Xk found on the stack at that time. If there were such a handle, then some DFA state on the stack, below the top, would have a complete item. Note that each state, when it is first put on the stack either by (10.4) or (10.5), is on top of the stack. Suppose there were such a state containing a complete item A → α·. Then it will immediately call for reduction of α to A. If α ≠ ε, the state containing A → α· is removed from the stack and cannot be buried. If α = ε, then reduction of ε to A occurs by (10.4), causing A to be put on the stack above Xk. In this case, there will always be a variable above Xk on the stack as long as X1 ··· Xk occupies the bottom positions of the stack. But A → ε at position k could not be the handle of any right-sentential form X1 ··· Xk β, where β contains a variable.

One last point concerns acceptance by M. If the top state on the stack is {S → α·}, where S is G's start symbol, then M pops its stack, accepting. In this case we have completed the reverse of a rightmost derivation of the original input. Note that, as S does not appear on the right of any production, it is impossible that there is an item of the form A → S·α valid for viable prefix S. Thus there is never a need to shift additional input symbols when S alone appears on the stack. Put another way, L(G) always has the prefix property if G is LR(0).

We have thus proved that if w is in L(G), M finds a rightmost derivation of w, reduces w to S, and accepts. Conversely, if M accepts w, the sequence of right-sentential forms represented by the ID's of M provides a derivation of w from S. Thus N(M) = L(G).

Corollary Every LR(0) grammar is unambiguous.

Proof The above argument shows that the rightmost derivation of w is unique.

Example 10.10 Consider the DPDA M constructed as in Theorem 10.10 from the DFA of Fig. 10.3. Let 0, 1, ..., 8 be the names of the states corresponding to the sets of items I0, I1, ..., I8, respectively. Let the input be aababbc. The moves of M are listed in Fig. 10.4.
      Stack         Remaining input   Comments
 1)   0             aababbc           Initial ID
 2)   0a3           ababbc            Shift
 3)   0a3a3         babbc             Shift
 4)   0a3a3b7       abbc              Shift
 5)   0a3A2         abbc              Reduce by A → ab
 6)   0a3S6         abbc              Reduce by S → A
 7)   0a3S6a3       bbc               Shift
 8)   0a3S6a3b7     bc                Shift
 9)   0a3S6A5       bc                Reduce by A → ab
10)   0a3S6         bc                Reduce by S → SA
11)   0a3S6b8       c                 Shift
12)   0A2           c                 Reduce by A → aSb
13)   0S1           c                 Reduce by S → A
14)   0S1c4         ε                 Shift
15)                                   Accept

Fig. 10.4 Sequence of moves of DPDA M.
For example, in line (1), state 0 is on top of the stack. There is no complete item in set I0, so we shift. The first input symbol is a, and there is a transition from I0 to I3 labeled a. Thus in line (2) the stack is 0a3. In line (9), 5 is the top state. I5 consists of the complete item S → SA·. We pop SA, together with its covering states, off the stack, leaving 0a3. We then push S onto the stack. There is a transition from I3 to I6 labeled S, so we cover S by 6, yielding the stack 0a3S6 in line (10).
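The whole of Fig. 10.4 can be reproduced by a small shift-reduce loop over the item-set DFA. The sketch below is our encoding (it rebuilds DFA states on the fly rather than naming them 0 through 8) and records the reductions in order.

```python
G = [('S1', ('S', 'c')), ('S', ('S', 'A')), ('S', ('A',)),
     ('A', ('a', 'S', 'b')), ('A', ('a', 'b'))]   # S1 stands for S'
VARS = {h for h, _ in G}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for h, b, d in list(items):
            if d < len(b) and b[d] in VARS:
                for h2, b2 in G:
                    if h2 == b[d] and (h2, b2, 0) not in items:
                        items.add((h2, b2, 0))
                        changed = True
    return frozenset(items)

def goto(state, X):
    return closure({(h, b, d + 1) for h, b, d in state
                    if d < len(b) and b[d] == X})

def parse(w):
    """LR(0) shift-reduce loop: returns (accepted, reductions in order)."""
    states = [closure({(h, b, 0) for h, b in G if h == 'S1'})]
    reductions, i = [], 0
    while True:
        complete = [it for it in states[-1] if it[2] == len(it[1])]
        if complete:
            h, b, _ = complete[0]          # unique, since G is LR(0)
            reductions.append((h, b))
            if h == 'S1':                  # reduce by S' -> Sc and accept
                return i == len(w), reductions
            del states[len(states) - len(b):]   # pop |b| covering states
            states.append(goto(states[-1], h))
        elif i < len(w):
            nxt = goto(states[-1], w[i])
            if not nxt:
                return False, reductions   # dead state: reject
            states.append(nxt)
            i += 1
        else:
            return False, reductions
```

On aababbc the reductions come out in exactly the order of Fig. 10.4: A → ab, S → A, A → ab, S → SA, A → aSb, S → A, and finally S' → Sc.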
LR(0) grammars from DPDA's

We now begin our study of the converse result — if L is N(M) for a DPDA M, then L has an LR(0) grammar. In fact, the grammar of Theorem 5.4 is LR(0) whenever M is deterministic, but it is easier to prove that a modification of that grammar is LR(0). The change we make is to put at the beginning of the right side of each production a symbol telling which PDA move gave rise to that production.

Formally, let M = (Q, Σ, Γ, δ, q0, Z0, ∅) be a DPDA. We define grammar G_M = (V, Σ, P, S) such that L(G_M) = N(M). V consists of the symbol S, the symbols [qXp] for q and p in Q and X in Γ, and the symbols A_{qaY} for q in Q, a in Σ ∪ {ε}, and Y in Γ. S and the [qXp]'s play the same role as in Theorem 5.3. Symbol A_{qaY} indicates that the production is obtained from the move δ(q, a, Y). The productions of G_M are as follows (with useless symbols and productions removed).

1) S → [q0 Z0 p] for all p in Q.
2) If δ(q, a, Y) = (p, ε), then there is a production [qYp] → A_{qaY}.
3) If δ(q, a, Y) = (p1, X1 X2 ··· Xk) for k ≥ 1, then for each sequence of states p2, p3, ..., p_{k+1} there is a production

[qY p_{k+1}] → A_{qaY} [p1 X1 p2][p2 X2 p3] ··· [pk Xk p_{k+1}].

4) For all q, a, Y, A_{qaY} → a.
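Rules (1)-(4) are easy to mechanize. The sketch below (our encoding, with bracketed variables rendered as strings) builds the productions of G_M for a hypothetical two-state DPDA that accepts {a^n b^n | n ≥ 1} by empty stack.

```python
from itertools import product

def build_gm(Q, delta, q0, Z0):
    """Productions of G_M per rules (1)-(4); a is '' for an epsilon move."""
    var = lambda q, X, p: f'[{q} {X} {p}]'
    P = set()
    for p in Q:                                   # rule (1)
        P.add(('S', (var(q0, Z0, p),)))
    for (q, a, Y), (p1, pushed) in delta.items():
        A = f'A_{q}{a}{Y}'
        P.add((A, (a,) if a else ()))             # rule (4)
        if not pushed:                            # rule (2)
            P.add((var(q, Y, p1), (A,)))
        else:                                     # rule (3)
            for seq in product(Q, repeat=len(pushed)):
                chain = (p1,) + seq
                body = (A,) + tuple(var(chain[i], X, chain[i + 1])
                                    for i, X in enumerate(pushed))
                P.add((var(q, Y, seq[-1]), body))
    return P

# Hypothetical DPDA: N(M) = {a^n b^n | n >= 1} by empty stack
delta = {
    ('q0', 'a', 'Z'): ('q0', ('X', 'Z')),
    ('q0', 'a', 'X'): ('q0', ('X', 'X')),
    ('q0', 'b', 'X'): ('q1', ()),
    ('q1', 'b', 'X'): ('q1', ()),
    ('q1', '',  'Z'): ('q1', ()),
}
P = build_gm(['q0', 'q1'], delta, 'q0', 'Z')
```

Each of the two push moves contributes |Q|² = 4 rule-(3) productions, one per choice of intermediate-state sequence; useless ones would be pruned afterward, as the construction assumes.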
Consider a rightmost derivation in G_M. It starts with S ⇒rm [q0 Z0 p] for some state p. Suppose for the sake of argument that δ(q0, a, Z0) = (r, XYZ). Then the only productions for [q0 Z0 p] that derive strings beginning with a (a may be ε) have right sides A_{q0 a Z0}[rXs][sYt][tZp] for some states s and t. Suppose that the rightmost derivation eventually derives string w from [tZp]. Then, if δ(s, b, Y) = (u, VW), we might continue the rightmost derivation as

S ⇒rm [q0 Z0 p]
  ⇒rm A_{q0 a Z0}[rXs][sYt][tZp]
  ⇒*rm A_{q0 a Z0}[rXs][sYt]w
  ⇒rm A_{q0 a Z0}[rXs]A_{sbY}[uVv][vWt]w.    (10.6)

Now consider the moves made by M before reading input w. The input corresponding to derivation (10.6) is of the form a x1 b x2 x3 w, where [rXs] ⇒*rm x1, [uVv] ⇒*rm x2, and [vWt] ⇒*rm x3. The corresponding sequence of moves is of the form†

(q0, a x1 b x2 x3 w, Z0) ⊢ (r, x1 b x2 x3 w, XYZ)
                         ⊢* (s, b x2 x3 w, YZ)
                         ⊢ (u, x2 x3 w, VWZ)
                         ⊢* (v, x3 w, WZ)
                         ⊢* (t, w, Z).    (10.7)

If we compare (10.6) and (10.7), we note that the stack symbols (Z in particular) which remain on the stack at the end of (10.7) are the symbols that do not appear (with states attached, in a bracketed variable) in the longest viable prefix of (10.6). The stack symbols popped from the stack in (10.7), namely X, V, and W, are the symbols that appear in the viable prefix of (10.6). This situation makes sense, since the symbols at the left end of a sentential form derive a prefix of a sentence, and that prefix is read first by the PDA. In general, given any viable prefix α of G_M, we can find a corresponding ID I of M in which the stack contains all and only the stack symbols that were introduced in a rightmost derivation of some αw and later replaced by a string of terminals. Moreover, I is obtained by having read any string derived from α. Since M is deterministic, we can argue that the derivations of right-sentential forms with prefix α have a specific form, and translate these limitations on derivations into restrictions on the set of items for α.

† Note that we have reverted to our original notation for ID's.
Lemma 10.6 If M is a DPDA and G_M the grammar constructed from M as above, then whenever [qXp] ⇒*rm w, there is a unique computation

(q, w, X) ⊢* (p, ε, ε).

Moreover, the sequence of moves made by M corresponds to the reverse of the sequence in which the subscripted A's are expanded, where the expansion of A_{saY} is deemed to correspond to the move in which the state is s, Y is on top of the stack, and input a is used.

Proof The existence of such a computation was proved in Theorem 5.3. Its uniqueness follows from the fact that M is deterministic. To show the correspondence between the moves of M and the reverse of the sequence of expansions of subscripted A's, we perform an easy induction on the length of a derivation. The key portion of the inductive step is when the first expansion is by rule (3):

[qXp] ⇒rm A_{qaX}[p1 X1 p2][p2 X2 p3] ··· [pk Xk p].

Then the explicitly shown A_{qaX} will be expanded after all the subscripted A's derived from the other variables are expanded. As the first move of M,

(q, w, X) ⊢ (p1, w', X1 X2 ··· Xk),

where w = aw', corresponds to A_{qaX}, we have part of the induction proved. The remainder of the induction follows from observing that in the moves of M, X1, X2, ..., Xk are removed from the stack in order, by using inputs w1, w2, ..., wk, where w1 w2 ··· wk = w', while in the rightmost derivation of w from [qXp], the derivation of w1 from [p1 X1 p2] follows the derivation of w2 from [p2 X2 p3], and so on. Since all these derivations are shorter than that of w from [qXp], we may use the inductive hypothesis to complete the proof.
Now, for each variable [qXp] of G_M, let us fix on a particular string w_{qXp} derived from [qXp].† Let h be the homomorphism from the variables of G_M to Σ* defined by

    h(A_{qaY}) = a,    h([qXp]) = w_{qXp}.

Let N(A_{qaY}) = 1, and let N([qXp]) be the number of moves in the computation corresponding to [qXp] ⇒* w_{qXp}. Extend N to V* by N(B₁B₂...B_k) = N(B₁) + N(B₂) + ... + N(B_k). Finally, let us represent a move δ(q, a, Y) of M by the triple (qaY), and let m be the homomorphism from V* to sequences of moves defined by

1) m(A_{qaY}) = (qaY);
2) m([qXp]) is the reverse of the sequence of subscripts of the A's expanded in the derivation of w_{qXp} from [qXp]. By Lemma 10.6, m([qXp]) is also the sequence of moves (q, w_{qXp}y, X) ⊢* (p, y, ε).

We can now complete our characterization of LR(0) grammars.

† We assume G_M has no useless symbols, so w_{qXp} exists.
LR(0) GRAMMARS AND DPDA'S

Lemma 10.7 Let γ be a viable prefix of G_M (γ is in V*). Then

    (q₀, h(γ), Z₀) ⊢* (p, ε, β)

for some p and β, by the sequence of moves m(γ).

Proof As γ is a viable prefix, there is some y in Σ* such that γy is a right-sentential form; that is, [q₀Z₀r] ⇒* γy ⇒* h(γ)y for some r. By Lemma 10.6, the last N(γ) expansions of A's in that derivation take place after the right-sentential form γy is reached. Also by Lemma 10.6, there is a unique sequence of moves (q₀, h(γ)y, Z₀) ⊢* (r, ε, ε), and the first N(γ) of these must be m(γ).
We are now ready to show that G_M is LR(0). Since the start symbol obviously does not appear on any right side, it suffices to show that each set of items with a complete item contains no other complete item and no item with a terminal immediately to the right of the dot. We prove these facts in the next two lemmas.

Lemma 10.8 Let I be the set of items for viable prefix γ. If A_{qaY} → ·a is in I, then there is no complete item B → β· in I.

Proof We consider, in cases, how the production B → β could have been introduced.

case 1 If B → β is a production from rule (1), then β = γ, and γ is a single variable [q₀Z₀p], since S appears on no right side. If A_{qaY} → ·a is valid for γ, then there is a derivation S ⇒* γA_{qaY}y ⇒ γay. However, no right-sentential form begins with a variable [q₀Z₀p] unless it is the first step of a derivation; all subsequent right-sentential forms begin with a subscripted A, until the last, which begins with a terminal. Thus γ could not be followed by A_{qaY} in a right-sentential form.

case 2 If B → β is introduced by rule (2) or (3), then γ = γ′β for some γ′, and γ′βA_{qaY}y is a right-sentential form. However, in any rightmost derivation, when B → β is applied, the last symbol of β is immediately expanded by rules (2), (3), or (4), so β could not appear intact in a right-sentential form.

case 3 If B → β is introduced by rule (4), then B is some A_{pcZ} and β = c. If A_{pcZ} → c· is valid for γ, then γA_{pcZ}, followed by a terminal string, is a right-sentential form, as is γA_{qaY}y. By Lemma 10.7, the first N(γ) + 1 moves made by M when given input h(γ)a are both m(γ)(pcZ) and m(γ)(qaY), contradicting the determinism of M. (Note that in the first of these sequences, a is not consumed.)

Lemma 10.9 If I is a set of items of G_M and B → β· is in I, then there is no other complete item C → α· in I.

Proof Again let γ be a viable prefix with set of valid items I.

case 1 Neither B → β nor C → α is a production introduced by rule (4). Then the form of productions of types (2) and (3), and the fact that productions of type (1) are applied only at the first step, tell us that as α and β are both suffixes of γ, we must have β = α. If these productions are of type (1), then B = C = S, so the two items are really the same. If the productions are of type (2) or (3), it is easy to check that B = C. For example, if β = α = A_{qaY}, then B and C are each [qYp] for some p. But rule (2) requires that δ(q, a, Y) contain (p, ...), and the determinism of M assures that p is unique.

case 2 B → β is from rule (1), (2), or (3), and C → α is from rule (4), or vice versa. We can rule out this possibility as before: γC, followed by a terminal string, is a right-sentential form, and γ ends as in cases (1) and (2) of Lemma 10.8.

case 3 B → β and C → α are both type (4) productions. Then γB and γC are viable prefixes, and Lemma 10.7 provides a contradiction to the determinism of M. That is, if β = α = ε, then the first N(γ) + 1 moves of M on input h(γ) must be m(γB) and must also be m(γC). If β = a ≠ ε and α = b ≠ ε, then the first N(γ) + 1 moves of M on input h(γ)a must be m(γB) and m(γC). If β = a ≠ ε and α = ε, then the first N(γ) + 1 moves of M on input h(γ)a provide a similar contradiction.
Theorem 10.11 If M is a DPDA, then G_M is an LR(0) grammar.

Proof Immediate from Lemmas 10.8 and 10.9.

We can now complete our characterization of LR(0) grammars.

Theorem 10.12 A language L has an LR(0) grammar if and only if L is a DCFL with the prefix property.
Proof

If: Suppose L is a DCFL with the prefix property. Then L is L(M′) for a DPDA M′. We can make M′ accept L by empty stack by putting a bottom-of-stack marker on M′ and causing M′ to enter a new state that erases the stack whenever it enters a final state. As L has the prefix property, we do not change the language accepted, and L is accepted by empty stack by the new DPDA, M. Thus L = L(G_M), and the desired conclusion follows from Theorem 10.11.

Only if: Theorem 10.10 says that L is N(M) for a DPDA M. We may use the construction of Theorem 5.2 to show that L is L(M′) for a DPDA M′. The fact that L has the prefix property follows from the fact that a DPDA "dies" when it empties its stack.

Corollary L$ has an LR(0) grammar if and only if L is a DCFL, where $ is not a symbol of L's alphabet.

Proof L$ surely has the prefix property. If L is a DCFL, it is easy to construct a DPDA for L$. Conversely, if L$ is a DCFL, then L = L$/$ is a DCFL by Theorem 10.2.

10.8 LR(k) GRAMMARS

It is interesting to note that if we add one symbol of "lookahead," by determining the set of following terminals on which reduction by A → α could possibly be performed, then we can use DPDA's to recognize the languages of a wider class of grammars. These grammars are called LR(1) grammars, for the one symbol of lookahead. It is known that all and only the deterministic CFL's have LR(1)
grammars. This class of grammars has great importance for compiler design, since they are broad enough to include the syntax of almost all programming languages, yet restrictive enough to have efficient parsers that are essentially DPDA's. It turns out that adding more than one symbol of lookahead to guide the choice of reductions does not add to the class of languages definable, although for any k, there are grammars, called LR(k), that may be parsed with k symbols of lookahead but not with k — 1 symbols of lookahead. Let us briefly give the definition and an example of LR(l) grammars, without proving any of the above contentions. The key extension of LR(0) grammars is that an LR(1 ) item consists of an LR(0) item followed by a lookahead set consisting of terminals and/or the special symbol $, which serves to denote the right end of a string. The generic form of an LR(1) item is thus
    A → α·β, {a₁, a₂, ..., aₙ}.

We say LR(1) item A → α·β, {a} is valid for viable prefix γ if there is a rightmost derivation

    S ⇒* δAy ⇒ δαβy,

where

i) δα = γ, and
ii) either a is the first symbol of y, or y = ε and a is $.

Also, A → α·β, {a₁, a₂, ..., aₙ} is valid for γ if for each i, A → α·β, {aᵢ} is valid for γ.
Like the LR(0) items, the sets of LR(1) items form the states of a viable prefix recognizing NFA, and we can compute the set of valid items for each viable prefix by converting this NFA to a DFA. The transitions of this NFA are defined as follows.
1) There is a transition on X from A → α·Xβ, {a₁, a₂, ..., aₙ} to A → αX·β, {a₁, a₂, ..., aₙ}.

2) There is a transition on ε from A → α·Bβ, {a₁, a₂, ..., aₙ} to B → ·γ, T, if B → γ is a production, and T is the set of terminals and/or $ such that b is in T if and only if either
   i) β derives a terminal string beginning with b, or
   ii) β ⇒* ε, and b is aᵢ for some 1 ≤ i ≤ n.

3) There is an initial state q₀ with transitions on ε to S → ·α, {$} for each production S → α.

Example 10.11 Consider the grammar

    S → A
    A → BA | ε                                        (10.8)
    B → aB | b

which happens to generate a regular set, (a*b)*. The NFA for grammar (10.8) is shown in Fig. 10.5, and the corresponding DFA of sets is shown in Fig. 10.6. The NFA of Fig. 10.5 is unusual in that no two items differ only in the lookahead sets; in general, we may see two items with the same dotted production.

Fig. 10.6 DFA for LR(1) items.
To see how Fig. 10.5 is constructed, consider item S → ·A, {$}. It has ε-transitions to items of the form A → ·BA, T and A → ·, T; what should T be? In rule (2) above, β is ε, so (2i) yields no symbols for T, and rule (2ii) tells us that $ is in T. Thus T = {$}. Now consider item A → ·BA, {$}. There are ε-transitions to B → ·aB, U and B → ·b, U, for some U. It is easy to check that A derives strings beginning with a and b, so a and b are in U. A also derives ε, so $ is in U, because it is the lookahead set of A → ·BA, {$}. Thus U = {a, b, $}.

A grammar is said to be LR(1) if

1) the start symbol appears on no right side, and
2) whenever the set of items I valid for some viable prefix includes some complete item A → α·, {a₁, a₂, ..., aₙ}, then
   i) no aᵢ appears immediately to the right of the dot in any item of I, and
   ii) if B → β·, {b₁, b₂, ..., b_k} is another complete item in I, then aᵢ ≠ b_j for any 1 ≤ i ≤ n and 1 ≤ j ≤ k.

Example 10.12 Consider Fig. 10.6. Sets of items I₁, I₄, I₅, and I₆ consist of only one item and so satisfy (2). Set I₀ has one complete item, A → ·, {$}. But $ does not appear to the right of a dot in any item of I₀. A similar remark applies to I₂, and I₃ has no complete items. Thus grammar (10.8) is LR(1). Note that this grammar is not LR(0); its language does not have the prefix property.
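The lookahead computation of rule (2), as worked through in Example 10.11, can be sketched in code. The item representation below (head, body, dot position, lookahead set) and the hard-coded FIRST and nullability facts for grammar (10.8) are assumptions of this sketch; a general tool would derive them from the productions.

```python
# Sketch of the epsilon-transitions of rule (2) for grammar (10.8):
#   S -> A,  A -> BA | eps,  B -> aB | b
# An item is (head, body, dot, lookaheads). FIRST sets and nullability
# are precomputed by hand for this tiny grammar.

PRODS = {'S': ['A'], 'A': ['BA', ''], 'B': ['aB', 'b']}
FIRST = {'A': {'a', 'b'}, 'B': {'a', 'b'}}
NULLABLE = {'A': True, 'B': False}

def closure(items):
    """Add all items reachable from `items` by rule (2)'s epsilon-moves."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, body, dot, las in list(items):
            if dot < len(body) and body[dot] in PRODS:  # dot before a variable
                beta = body[dot + 1:]
                t = set()                               # the lookahead set T
                for sym in beta:                        # (2i): FIRST of beta
                    t |= FIRST.get(sym, {sym})
                    if not NULLABLE.get(sym, False):
                        break
                else:                                   # (2ii): beta derives eps
                    t |= set(las)
                for rhs in PRODS[body[dot]]:
                    item = (body[dot], rhs, 0, frozenset(t))
                    if item not in items:
                        items.add(item)
                        changed = True
    return items
```

Running `closure` on the single item S → ·A, {$} reproduces the sets computed in Example 10.11: T = {$} for the A-items and U = {a, b, $} for the B-items.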
The automaton that accepts an LR(1) language is like a DPDA, except that it is allowed to use the next input symbol in making its decisions even if it makes a move that does not consume its input. We can simulate such an automaton by an ordinary DPDA if we append $ to the end of the input. Then the DPDA can keep the next symbol or $ in its state to indicate the symbol scanned. The stack of our automaton is like the stack of the LR(0) grammar recognizing DPDA: it has alternating grammar symbols and sets of items. The rules whereby it decides to reduce or shift an input symbol onto the stack are:

1) If the top set of items has complete item A → α·, {a₁, a₂, ..., aₙ}, where A ≠ S, and the current symbol is in {a₁, a₂, ..., aₙ}, then reduce by A → α.
2) If the top set of items has an item S → α·, {$}, and the current input symbol is $, that is, the end of the input is reached, then reduce by S → α and accept.
3) If the top set of items has an item A → α·aβ, T, and a is the current input symbol, then shift.

Note that the definition of an LR(1) grammar guarantees that at most one of the above will apply for any particular input symbol or $. We customarily summarize these decisions by a table whose rows correspond to the sets of items and whose columns are the terminals and $.
Example 10.13 The table for grammar (10.8), built from Fig. 10.6, is shown in Fig. 10.7. Empty entries indicate an error; the input is not in the language. The sequence of actions taken by the parser on input aabb is shown in Fig. 10.8. The number i on the stack represents set of items Iᵢ from Fig. 10.6. The proper set of items with which to cover a given grammar symbol is determined from the DFA transitions (Fig. 10.6) exactly as for an LR(0) grammar.

         a                  b                  $
    I₀   Shift              Shift              Reduce by A → ε
    I₁                                         Accept
    I₂   Shift              Shift              Reduce by A → ε
    I₃   Shift              Shift
    I₄   Reduce by B → b    Reduce by B → b    Reduce by B → b
    I₅                                         Reduce by A → BA
    I₆   Reduce by B → aB   Reduce by B → aB   Reduce by B → aB

Fig. 10.7 Decision table for grammar (10.8).

    Stack        Remaining input    Comments
    0            aabb$              Initial
    0a3          abb$               Shift
    0a3a3        bb$                Shift
    0a3a3b4      b$                 Shift
    0a3a3B6      b$                 Reduce by B → b
    0a3B6        b$                 Reduce by B → aB
    0B2          b$                 Reduce by B → aB
    0B2b4        $                  Shift
    0B2B2        $                  Reduce by B → b
    0B2B2A5      $                  Reduce by A → ε
    0B2A5        $                  Reduce by A → BA
    0A1          $                  Reduce by A → BA
                                    Reduce by S → A and accept

Fig. 10.8 Action of LR(1) parser on input aabb.
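The trace of Fig. 10.8 can be exercised with a small table-driven driver. This is a sketch rather than the book's construction verbatim: the stack here keeps only the item-set numbers of Fig. 10.6 (the interleaved grammar symbols are left implicit), and the `goto` and `action` tables are transcribed by hand from Figs. 10.6 and 10.7.

```python
# Table-driven LR(1) driver for grammar (10.8): S -> A, A -> BA | eps, B -> aB | b.
# Item-set numbering 0..6 follows Fig. 10.6; `action` is the decision table
# of Fig. 10.7 ('shift', 'accept', or a production to reduce by).

goto = {
    (0, 'a'): 3, (0, 'b'): 4, (0, 'A'): 1, (0, 'B'): 2,
    (2, 'a'): 3, (2, 'b'): 4, (2, 'A'): 5, (2, 'B'): 2,
    (3, 'a'): 3, (3, 'b'): 4, (3, 'B'): 6,
}

action = {
    0: {'a': 'shift', 'b': 'shift', '$': ('A', '')},
    1: {'$': 'accept'},
    2: {'a': 'shift', 'b': 'shift', '$': ('A', '')},
    3: {'a': 'shift', 'b': 'shift'},
    4: {'a': ('B', 'b'), 'b': ('B', 'b'), '$': ('B', 'b')},
    5: {'$': ('A', 'BA')},
    6: {'a': ('B', 'aB'), 'b': ('B', 'aB'), '$': ('B', 'aB')},
}

def parse(w):
    """Return True iff w is in (a*b)*, by table-driven LR(1) parsing."""
    tape = list(w) + ['$']           # append the $ endmarker
    stack = [0]                      # item-set numbers only
    while True:
        act = action[stack[-1]].get(tape[0])
        if act is None:
            return False             # empty table entry: error
        if act == 'accept':
            return True
        if act == 'shift':
            stack.append(goto[(stack[-1], tape[0])])
            tape.pop(0)
        else:                        # reduce by production head -> body
            head, body = act
            for _ in body:
                stack.pop()          # pop one state per body symbol
            nxt = goto.get((stack[-1], head))
            if nxt is None:
                return False
            stack.append(nxt)
```

On input `aabb` the driver goes through exactly the twelve configurations of Fig. 10.8 before accepting.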
EXERCISES

10.1 Show that the normal form of Lemma 10.2 holds for nondeterministic PDA's.

10.2
a) Show that every DCFL is accepted by a DPDA whose only ε-moves are pop moves.
b) Show that the DPDA of part (a) can be made to satisfy the normal form of Lemma 10.2.

10.3 Give an efficient algorithm to implement rule (4) of Lemma 10.3.

*S10.4 Show that the DCFL's are not closed under union, concatenation, Kleene closure, or homomorphism.

**10.5 Show that the following are not DCFL's:
Sa) {ww^R | w is in (0 + 1)*}
b) {0^n1^n | n ≥ 1} ∪ {0^n1^{2n} | n ≥ 1}

**10.6 Prove that {0^i1^j | i ≥ j ≥ 1} ∪ {0^i1^j2^j | i, j ≥ 1} is a DCFL, but is not accepted by any DPDA without ε-moves.

10.7 Show that if L is a DCFL, then L is accepted by a DPDA which, if it accepts a₁a₂...aₙ, does so immediately upon consuming aₙ (without subsequent ε-moves). [Hint: Use the predicting machine.]

10.8 Does Greibach's theorem (Theorem 8.14) apply to the DCFL's?

10.9 Construct the nonempty sets of items for the following grammars. Which are LR(0)?
a) S′ → S
   S → aSa | bSb
b) S′ → S
   S → aSa | bSb | c
Sc) S → E₁
   E₁ → T₃E₁ | T₁
   T₁ → a$ | (E₂$
   E₂ → T₃E₂ | T₂
   T₂ → a) | (E₂)
   T₃ → a+ | (E₂+

10.10 Show the sequence of stacks used by the DPDA constructed from grammar 10.9(c) when the input is (a + a)$.

10.11 Construct the nonempty sets of LR(1) items for the following grammars. Which are LR(1)?
a) S → A
   A → AB | ε
   B → aB | b
b) S → E
   E → E + T | T
   T → a | (E)

10.12 Repeat Exercise 10.10 for grammar 10.11(b).

10.13 Let G be an LR(0) grammar with A → α· valid for some viable prefix γ. Prove that no other production can be valid for γ.
Solutions to selected exercises

10.4 Let L₁ = {0^i1^j2^i | i, j ≥ 0} and L₂ = {0^i1^j2^j | i, j ≥ 0}. It is easy to show that L₁ and L₂ are DCFL's. However, L₁ ∪ L₂ is the CFL shown not to be a DCFL in Example 10.1. Thus the DCFL's are not closed under union.

For concatenation, let L₃ = aL₁ ∪ L₂. Then L₃ is a DCFL, because the presence or absence of symbol a tells us whether to look for a word in L₁ or a word in L₂. Surely a* is a DCFL. However, a*L₃ is not a DCFL. If it were, then L₄ = a*L₃ ∩ a0*1*2* would be a DCFL by Theorem 10.4. But L₄ = aL₁ ∪ aL₂. If L₄ is a DCFL, accepted by DPDA M, we could recognize L₁ ∪ L₂ by simulating M on (imaginary) input a and then on the real input. As L₁ ∪ L₂ is not a DCFL, neither is L₄, and therefore the DCFL's are not closed under concatenation.

The proof for Kleene closure is similar, if we let L₅ = {a} ∪ L₃. L₅ is a DCFL, but L₅* is not, by a proof similar to the above.

For homomorphism, let L₆ = aL₁ ∪ bL₂, which is a DCFL. Let h be the homomorphism that maps b to a and maps other symbols to themselves. Then h(L₆) = L₄, so the DCFL's are not closed under homomorphism.
10.5(a) Suppose L₁ = {ww^R | w is in (0 + 1)*} were a DCFL. Then by Theorem 10.4, so would be L₂ = L₁ ∩ (01)*(10)*(01)*(10)*. Now

    L₂ = {(01)^i(10)^j(01)^j(10)^i | i, j ≥ 0}.

By Theorem 10.3, L₃ = MIN(L₂) is a DCFL. But

    L₃ = {(01)^i(10)^j(01)^j(10)^i | 0 ≤ j ≤ i, i and j not both 0},

since if j > i, a prefix is in L₂. Let h be the homomorphism h(a) = 01 and h(b) = 10. Then

    L₄ = h⁻¹(L₃) = {a^i b^j a^j b^i | 0 ≤ j ≤ i}

is a DCFL by Theorem 10.4. However, the pumping lemma with z = a^{n+1}b^n a^n b^{n+1} shows that L₄ is not even a CFL.
10.9(c) Before tackling this project, let us describe the language of the grammar. We first describe "expression" and "term" recursively, as follows.

1) A term is a single symbol a, which stands for any "argument" of an arithmetic expression, or a parenthesized expression.
2) An expression is one or more terms connected by plus signs.

Then the language of this grammar is the set of all expressions followed by an endmarker, $. E₁ and T₁ generate expressions and terms followed by a $. E₂ and T₂ generate expressions and terms followed by a right parenthesis, and T₃ generates a term followed by a plus sign.

It turns out that LR(0) grammars to define arithmetic expressions, of which our grammar is a simple example, are quite contorted, in comparison with an LR(1) grammar for the same language [see Exercise 10.11(b)]. For this reason, practical compiler-writing systems never require that the syntax of a language be described by an LR(0) grammar; LR(1) grammars, or a subset of these, are preferred. Nevertheless, the present grammar will serve as a useful exercise.

The DFA accepting viable prefixes has 20 states, not counting the dead state. We tabulate these sets of items in Fig. 10.9. Figure 10.10 gives the transition table for the DFA; blanks indicate transitions to the dead state. Inspection of the sets of items tells us that certain sets, namely 1, 3, 6, 7, 8, 11, 14, 15, 16, 17, and 19, consist of a single complete item, while the remainder have no complete items. Thus the grammar is LR(0).
Fig. 10.9 Sets of items for Exercise 10.9(c).
BIBLIOGRAPHIC NOTES

Deterministic pushdown automata were first studied by Fischer [1963], Schutzenberger [1963], Haines [1965], and Ginsburg and Greibach [1966a]. Lemma 10.3, the fact that DPDA's can be made to consume all their input, is from Schutzenberger [1963]; Theorem 10.1, closure under complementation, was observed independently by various people. Most of the closure and decision properties, Theorems 10.2 through 10.8, were first proved by Ginsburg and Greibach [1966a]. An exception is the fact that it is decidable whether a DCFL is regular, which was proved by Stearns [1967]. The predicting-machine construction is from Hopcroft and Ullman [1969b].

Fig. 10.10 Transition table of viable prefix recognizing DFA.

LR(k) grammars and the equivalence of DPDA's to LR(1) grammars is from Knuth [1965]. The latter work generalized a sequence of papers dealing with subclasses of the CFG's having efficient parsing algorithms. The history of this development is described in Aho and Ullman [1972, 1973]. Graham [1970] shows that a number of other classes of grammars define exactly the CFL's. Subsequent to Knuth [1965], a series of papers examined the class of LR(1) grammars for a useful subclass for which parsers of reasonable size could be built. Korenjak's [1969] was the first such method, although two subclasses of LR(1) grammars, called SLR(1) (for "simple" LR) and LALR(1) (for "lookahead LR"), due to DeRemer [1969, 1971], are the methods used most commonly today. By way of comparison, a typical programming language, such as ALGOL, has an LR(1) parser (viable prefix recognizing DFA) with several thousand states, and even more are needed for an LR(0) parser. As the transition table must be part of the parser for a language, it is not feasible to store such a large parser in the main memory of the computer, even if the table is compacted. However, the same languages have SLR(1) or LALR(1) parsers of a few hundred states, which fit easily with compaction. See Aho and Johnson [1974] or Aho and Ullman [1977] for a description of how LR-based parsers are designed and used.

A good deal of research has been focused on the open question of whether equivalence is decidable for DPDA's. Korenjak and Hopcroft [1966] showed that equivalence is decidable for a subclass of the DCFL's called "simple" languages.† These are defined by grammars in Greibach normal form such that no two productions A → aα and A → aβ exist. The decidability of equivalence was extended to the LL(k) grammars of Lewis and Stearns [1968] (see Exercise 6.13 for a definition), which are a proper subset of the LR(k) grammars, by Rosenkrantz and Stearns [1970]. Valiant [1973] showed that equivalence was decidable for finite-turn DPDA's, among other classes; see also Valiant [1974], Beeri [1976], and Taniguchi and Kasami [1976]. Friedman [1977] showed that equivalence for DPDA's is decidable if and only if it is decidable for "monadic recursion schemes," which in terms of automata can be viewed as one-state DPDA's that can base their next move on the current input symbol, without consuming that symbol. Additionally, work was done on extending the undecidability of containment for DCFL's to small subsets of the DCFL's. The work culminated in Friedman [1976], which proved that containment is undecidable even for the simple languages of Korenjak and Hopcroft [1966]. A solution to Exercise 10.5(b) is found in Ginsburg and Greibach [1966a], and Exercise 10.6 is based on Cole [1969].

† These are not related to "simple" LR grammars in any substantial way.
CHAPTER 11

CLOSURE PROPERTIES OF FAMILIES OF LANGUAGES

There are striking similarities among the closure properties of the regular sets, the context-free languages, the r.e. sets, and other classes. Not only are the closure properties similar, but so are the proof techniques used to establish these properties. In this chapter we take a general approach and study all families of languages having certain closure properties. This will provide new insight into the underlying structure of closure properties and will simplify the study of new classes of languages.
11.1 TRIOS AND FULL TRIOS

Recall that a language is a set of finite-length strings over some finite alphabet. A family of languages is a collection of languages containing at least one nonempty language. A trio is a family of languages closed under intersection with a regular set, inverse homomorphism, and ε-free homomorphism. [We say a homomorphism h is ε-free if h(a) ≠ ε for any symbol a.] If the family of languages is closed under all homomorphisms, as well as inverse homomorphism and intersection with a regular set, then it is said to be a full trio.

Example 11.1 The regular sets, the context-free languages, and the r.e. sets are full trios. The context-sensitive languages and the recursive sets are trios but not full trios, since they are not closed under arbitrary homomorphisms. In fact, closing the CSL's or the recursive sets under arbitrary homomorphisms yields the r.e. sets (see Exercise 9.14 and its solution).
Theorem 3.3 showed that regular sets are closed under intersection; hence they are closed under "intersection with a regular set." The corollary to Theorem 3.5 showed closure of the regular sets under homomorphisms and inverse homomorphism, completing the proof that the regular sets form a full trio. Theorems 6.2, 6.3, and 6.5 show that the CFL's are a full trio.

Exercise 9.10 and its solution provide a proof that the CSL's are closed under inverse homomorphism, intersection (hence intersection with a regular set), and substitution (hence ε-free homomorphism), but not all homomorphisms, since ε is not permitted in a CSL. Thus the CSL's are a trio but not a full trio.

We shall prove that the recursive sets are a trio, leaving the proof that the r.e. sets are a full trio as an exercise. Let h be a homomorphism and L a recursive language recognized by algorithm A. Then h⁻¹(L) is recognized by algorithm B, which simply applies A to h(w), where w is B's input. Let g be an ε-free homomorphism. Then g(L) is recognized by algorithm C which, given input w of length n, enumerates all the words x of length up to n over the domain alphabet of g. For each x such that g(x) = w, algorithm A is applied to x, and if x is in L, algorithm C accepts w. Note that since g is ε-free, w cannot be g(x) if |x| > |w|. Finally, if R is a regular set accepted by DFA M, we may construct algorithm D that accepts input w if and only if A accepts w and M accepts w.
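The three algorithms in this proof are short enough to sketch directly. The function names, and the use of Python strings and predicates in place of abstract sentences and algorithms, are illustrative assumptions only.

```python
# Sketch of algorithms B and C from the proof that the recursive sets are a
# trio: given a decision procedure `in_L` for a recursive language L and a
# symbol-to-string table for a homomorphism, decide membership in h^-1(L),
# and (for an eps-free g) in g(L).
from itertools import product

def extend(table, w):
    """Extend a symbol map to a homomorphism on strings."""
    return ''.join(table[a] for a in w)

def in_h_inverse(w, table, in_L):
    # Algorithm B: w is in h^-1(L) iff h(w) is in L.
    return in_L(extend(table, w))

def in_g_image(w, table, in_L, sigma):
    # Algorithm C: since g is eps-free, any preimage x of w has |x| <= |w|,
    # so enumerate all candidate strings over sigma up to that length.
    for n in range(len(w) + 1):
        for x in map(''.join, product(sigma, repeat=n)):
            if extend(table, x) == w and in_L(x):
                return True
    return False
```

For example, with L = {0ⁿ1ⁿ | n ≥ 0} and the ε-free g given by g(0) = a, g(1) = bb, the string abb is in g(L) because its preimage 01 is in L.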
We conclude this section by observing that every full trio contains all regular sets. Thus the regular sets are the smallest full trio. Also, the ε-free regular sets are the smallest trio. (A language is ε-free if ε is not a member of the language.)

Lemma 11.1 Every full trio contains all regular sets; every trio contains all ε-free regular sets.

Proof Let 𝒞 be a trio, Σ an alphabet, and R ⊆ Σ⁺ an ε-free regular set. Since 𝒞 contains at least one nonempty language, let L be in 𝒞 and w be in L. Define Σ′ = {a′ | a is in Σ}, and let h be the homomorphism that maps each a in Σ to ε and each a′ in Σ′ to w. Then L₁ = h⁻¹(L) is in 𝒞 because 𝒞 is a trio. As w is in L, L₁ contains all strings in Σ′Σ*, and others as well. Let g be the homomorphism g(a′) = g(a) = a for all a in Σ. Then, g being ε-free, we know that L₂ = g(L₁) is in 𝒞, and L₂ is either Σ* or Σ⁺, depending on whether or not L₁ contains ε. Thus L₂ ∩ R = R is in 𝒞, proving our contention that every trio contains all ε-free regular sets.

If 𝒞 is a full trio, we may modify the above proof by letting g′(a′) = ε and g′(a) = a for all a in Σ. Then g′(L₁) = Σ*. If R is any regular set whatsoever, g′(L₁) ∩ R = R is in 𝒞.

We leave it as an exercise to show that the ε-free regular sets are a trio and hence the smallest trio. Note that they do not form a full trio, because they are not closed under all homomorphisms.
11.2 GENERALIZED SEQUENTIAL MACHINE MAPPINGS

In studying closure properties, one quickly observes that certain properties follow automatically from others. Thus to establish a set of closure properties for a class of languages one need only establish a set of properties from which the others follow. In this section we shall establish a number of closure properties that follow from the basic properties of trios and full trios.

The first operation we consider is a generalization of a homomorphism. Consider a Mealy machine that is permitted to emit any string, including ε, in a move. This device is called a generalized sequential machine, and the mapping it defines is called a generalized sequential machine mapping.

More formally, a generalized sequential machine (GSM) is a 6-tuple M = (Q, Σ, Δ, δ, q₀, F), where Q, Σ, and Δ are the states, input alphabet, and output alphabet, respectively, δ is a mapping from Q × Σ to finite subsets of Q × Δ*, q₀ is the initial state, and F is the set of final states. The interpretation of (p, w) in δ(q, a) is that M in state q with input symbol a may, as one possible choice of move, enter state p and emit the string w.
We extend the domain of δ to Q × Σ* as follows.

1) δ(q, ε) = {(q, ε)}.
2) For x in Σ* and a in Σ,
   δ(q, xa) = {(p, w) | w = w₁w₂ and, for some p′, (p′, w₁) is in δ(q, x) and (p, w₂) is in δ(p′, a)}.

A GSM is ε-free if δ maps Q × Σ to finite subsets of Q × Δ⁺.

Let M(x), where M is a GSM as defined above, denote the set {y | (p, y) is in δ(q₀, x) for some p in F}. If L is a language over Σ, let M(L) denote {y | y is in M(x) for some x in L}. We say that M(L) is a GSM mapping; if M is ε-free, then M(L) is an ε-free GSM mapping. Note that L is a parameter of the mapping, not a given language.

Also let M⁻¹(x) = {y | M(y) contains x}, and M⁻¹(L) = {y | x is in M(y) for some x in L}. We say that M⁻¹(L) is an inverse GSM mapping. It is not necessarily true that M⁻¹(M(L)) = M(M⁻¹(L)) = L, so M⁻¹ is not a true inverse.

Example 11.2 Let M = ({q₀, q₁}, {0, 1}, {a, b}, δ, q₀, {q₁}). We define δ by

    δ(q₀, 0) = {(q₀, aa), (q₁, b)},
    δ(q₀, 1) = {(q₀, a)},
    δ(q₁, 1) = {(q₁, ε)}.

We may draw a GSM as a finite automaton, with an edge labeled a/w from state q to state p if δ(q, a) contains (p, w). The diagram for M above is shown in Fig. 11.1.

Fig. 11.1 A GSM.

Intuitively, as 0's are input to M in state q₀, M has the choice of either emitting two a's or emitting one b and going to state q₁. If 1 is input to M in state q₀, M can only output an a. In state q₁, M dies on a 0-input, but can remain in state q₁ with no output on a 1-input.

Let L = {0ⁿ1ⁿ | n ≥ 1}. Then M(L) = {a^{2n}b | n ≥ 0}. As 0's are read by M, it emits two a's per 0, until at some time it guesses that it should emit the symbol b and go to state q₁. If 1's do not follow immediately on the input, M dies. Or if M chooses to stay in q₀ when the first 1 is read, it can never reach q₁ if the input is of the form 0ⁿ1ⁿ. Thus the only output made by M when given input 0ⁿ1ⁿ is a^{2n-2}b.

If L₁ = {a^{2n}b | n ≥ 0}, then

    M⁻¹(L₁) = {w01^i | i ≥ 0 and w has an even number of 1's}.

Note that M⁻¹(M(L)) ≠ L.
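The machine of Example 11.2 is small enough to simulate directly. The dictionary below encodes the δ just defined, and `gsm_map` computes M(x) by following every nondeterministic choice and keeping the outputs produced along paths that end in the final state q₁.

```python
# Simulation of the GSM of Example 11.2:
#   delta(q0,'0') = {(q0,'aa'), (q1,'b')},  delta(q0,'1') = {(q0,'a')},
#   delta(q1,'1') = {(q1,'')};  q1 is the only final state.

delta = {
    ('q0', '0'): {('q0', 'aa'), ('q1', 'b')},
    ('q0', '1'): {('q0', 'a')},
    ('q1', '1'): {('q1', '')},
}
START, FINALS = 'q0', {'q1'}

def gsm_map(x):
    """Compute M(x): all outputs over computations ending in a final state."""
    configs = {(START, '')}          # (current state, output emitted so far)
    for a in x:
        configs = {(p, out + w)
                   for (q, out) in configs
                   for (p, w) in delta.get((q, a), set())}   # missing entry: dies
    return {out for (q, out) in configs if q in FINALS}
```

On input 0²1² the only surviving output is a²b, matching the a^{2n-2}b of the example.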
The GSM mapping is a useful tool for expressing one language in terms of a second language having essentially the same structure but different external trappings. For example, L₁ = {aⁿbⁿ | n ≥ 1} and L₂ = {aⁿbaⁿ | n ≥ 1} in some sense have the same structure, but differ slightly. L₁ and L₂ are easily expressible in terms of each other by GSM mappings. Figure 11.2(a) shows a GSM mapping L₁ to L₂, and Fig. 11.2(b) shows a GSM mapping L₂ to L₁.

Fig. 11.2 Two GSM's.
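Since Fig. 11.2 itself is not reproduced in this text, the transitions below are an assumption read off the intended behavior: a deterministic GSM mapping L₁ = {aⁿbⁿ | n ≥ 1} to L₂ = {aⁿbaⁿ | n ≥ 1} by copying a's, translating the first b to ba, and each later b to a.

```python
# A deterministic GSM in the spirit of Fig. 11.2(a): in q0, a/a stays and
# b/ba moves to the final state q1; in q1, b/a stays. Any other move dies.

def map_L1_to_L2(x):
    """Return the unique output of the GSM on x, or None if the GSM dies."""
    state, out = 'q0', ''
    for c in x:
        if state == 'q0' and c == 'a':
            out += 'a'
        elif state == 'q0' and c == 'b':
            out += 'ba'
            state = 'q1'
        elif state == 'q1' and c == 'b':
            out += 'a'
        else:
            return None              # e.g., an a after the first b
    return out if state == 'q1' else None   # q1 is the final state
```

Restricted to inputs from L₁, the machine produces exactly L₂: aⁿbⁿ maps to aⁿb·aⁿ⁻¹·a = aⁿbaⁿ.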
GSM mappings A key fact about GSM mappings Closure under
is that they can be expressed in terms of homomorphisms, inverse homomorphisms, and intersection with regular sets. Thus any class of languages closed under the latter operations is closed under GSM mappings.
Theorem
1
Every
1.1
GSM
under ofree
full trio is
GSM mappings. Every trio
closed under
is
closed
q 0 F) a
GSM.
mappings.
Proof Let % be a full trio, L a member of <€, and must prove that M(L) is in (€. Let
M = (g, I, A, d
y
,
We
Ai
and
h x and h 2 be the
let
h t ([q,
0, x, /?])
A* such 1)
=
the
=
contains
{[q, a, x, p] \$(q, a)
(p, x)}
homomorphisms from Af
a and h 2 ([q,
a, x, p])
=
R
Let
x.
to X* and Af to A* defined by be the regular set of all strings in
that first
component of
the
first
symbol
2) the last
component
3) the last
component of each symbol
of the last
symbol is
is
q0
,
the start state of
a final state of
is
the
same
as the
M;
M;
first
component of the
succeeding symbol. It
is
easy to check that
R
is
a regular
remembering the previous symbol
set.
A DFA
in its state
can verify condition
and comparing
it
(3)
by
with the current
symbol.
M(L) =
l
n
l
maps M's input encoded in it, for each symbol of the input string, a possible state transition of on the input symbol and a corresponding output string. The regular set forces the sequence of states to be a possible sequence of state transiIf
c
is
not
in
L, then
h 2 (hi
(L)
R).
That
is,
h\
to a string that has
M
tions of
M.
Finally, h 2 erases the input
and
state transitions, leaving only the
11.2
GENERALIZED SEQUENTIAL MACHINE MAPPINGS
|
275
output string. Formally,

h₁⁻¹(L) = {[p₁, a₁, x₁, q₁][p₂, a₂, x₂, q₂] ⋯ [pₖ, aₖ, xₖ, qₖ] | a₁a₂ ⋯ aₖ is in L, the pᵢ's are arbitrary, and (qᵢ, xᵢ) is in δ(pᵢ, aᵢ) for all i}.

Intersecting h₁⁻¹(L) with R yields

L′ = {[q₀, a₁, x₁, q₁][q₁, a₂, x₂, q₂] ⋯ [qₖ₋₁, aₖ, xₖ, qₖ] | a₁a₂ ⋯ aₖ is in L, qₖ is in F, and δ(qᵢ₋₁, aᵢ) contains (qᵢ, xᵢ) for all i}.

We must also make sure ε appears in the resulting language exactly when it should. If ε is in L and q₀ is a final state, then M(ε) = ε, so ε is in M(L) by definition; in this case let L′ = h₁⁻¹(L) ∩ (R + ε). If ε is in L but q₀ is not a final state, then ε is not in M(L), and we take L′ = h₁⁻¹(L) ∩ R. In either case h₂(L′) = M(L). Since every full trio is closed under inverse homomorphism, intersection with a regular set, and homomorphism, M(L) is in 𝒞.

The proof for trios and ε-free GSM mappings proceeds in a similar fashion. Since an ε-free GSM never emits ε, the x in [q, a, x, p] is never ε, and consequently h₂ is ε-free.
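The decomposition in the proof can be exercised concretely. The following Python sketch (an illustration added here, not part of the formal development; the GSM, its states, and its alphabets are made up) compares a direct simulation of a GSM with the h₂(h₁⁻¹(L) ∩ R) route on single input strings:

```python
from itertools import product

# An illustrative GSM (all names made up): delta[(q, a)] = set of (p, x),
# meaning: in state q reading a, go to state p and emit x.
delta = {
    (0, 'a'): {(0, 'x'), (1, 'xx')},
    (1, 'a'): {(1, '')},
}
q0, F = 0, {1}

def gsm_image(w):
    """Direct simulation: all outputs the GSM can produce on w, ending in F."""
    configs = {(q0, '')}                      # pairs (state, output so far)
    for a in w:
        configs = {(p, out + x)
                   for (q, out) in configs
                   for (p, x) in delta.get((q, a), set())}
    return {out for (q, out) in configs if q in F}

def gsm_image_via_trio_ops(w):
    """The proof's route: h1-inverse picks a transition symbol [q, a, x, p]
    per input symbol, R keeps state-consistent accepting sequences,
    and h2 projects out the outputs."""
    per_symbol = [[(q, b, x, p)
                   for (q, b), moves in delta.items() if b == a
                   for (p, x) in moves]
                  for a in w]
    out = set()
    for seq in product(*per_symbol):
        if (seq[0][0] == q0 and seq[-1][3] in F
                and all(seq[i][3] == seq[i + 1][0] for i in range(len(seq) - 1))):
            out.add(''.join(t[2] for t in seq))  # h2: keep only the x's
    return out

assert gsm_image('aa') == gsm_image_via_trio_ops('aa')
```

The two functions agree on every nonempty input, which is exactly the content of the claim M(L) = h₂(h₁⁻¹(L) ∩ R) restricted to singleton languages.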
Limited erasing

Trios are not necessarily closed under homomorphisms that result in arbitrary erasing. However, trios are closed under certain homomorphisms that allow erasing, provided the erasing is limited. A class of languages is said to be closed under k-limited erasing if for any language L of the class and any homomorphism h such that h never maps more than k consecutive symbols of any sentence x in L to ε, h(L) is in the class. The class is closed under limited erasing if it is closed under k-limited erasing for all k. Note that if h(a) is ε for some a, then whether h is k-limited on L depends on L.

Lemma 11.2 Every trio is closed under limited erasing.

Proof Let 𝒞 be a trio, L ⊆ Σ₁* a member of 𝒞, and h a homomorphism that is k-limited on L. Let

Σ₂ = {[x] | x is in Σ₁*, |x| ≤ k + 1, and h(x) ≠ ε}.

Let h₁ and h₂ be the homomorphisms defined by h₁([a₁a₂ ⋯ aₘ]) = a₁a₂ ⋯ aₘ and h₂([a₁a₂ ⋯ aₘ]) = h(a₁a₂ ⋯ aₘ). Since [a₁a₂ ⋯ aₘ] is in Σ₂ only if h(a₁a₂ ⋯ aₘ) ≠ ε, h₂ is ε-free. It is easy to check that h₂(h₁⁻¹(L)) = h(L). Then h(L) is in 𝒞, since trios are closed under inverse homomorphisms and ε-free homomorphisms.
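The block encoding in the lemma can be checked mechanically on a small case. The sketch below (added here for illustration; the homomorphism and sample language are made up) enumerates the segmentations into blocks of length at most k + 1 with nonempty image — these blocks play the role of the symbols [x] of Σ₂ — and confirms that applying h blockwise recovers h(L):

```python
# Illustrative homomorphism: h erases 'b' and keeps 'a'. On a language whose
# words never contain more than k consecutive b's, h is k-limited.
def h(s):
    return s.replace('b', '')

def segmentations(w, k):
    """All ways to cut w into blocks of length <= k+1 whose image under h
    is nonempty (the symbols of Sigma_2 in the proof of Lemma 11.2)."""
    if w == '':
        yield []
        return
    for i in range(1, min(k + 1, len(w)) + 1):
        block = w[:i]
        if h(block) != '':
            for rest in segmentations(w[i:], k):
                yield [block] + rest

k = 2
L = {'aba', 'abba', 'aa'}           # no word has more than 2 consecutive b's
# h2(h1^{-1}(L)): apply h blockwise over every valid segmentation.
image = {''.join(h(b) for b in seg) for w in L for seg in segmentations(w, k)}
assert image == {h(w) for w in L}   # equals h(L), as the lemma asserts
```

Because h is k-limited on L, every word of L admits at least one valid segmentation, which is why the two sides coincide.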
Theorem 11.2 Every trio is closed under inverse GSM mappings.
Proof Let 𝒞 be a trio, L a member of 𝒞, and let M = (Q, Σ, Δ, δ, q₀, F) be a GSM. Without loss of generality assume that the alphabets Σ and Δ are disjoint. If not, replace the symbols in Δ by new symbols and restore them at the end of the construction by an ε-free homomorphism mapping each new symbol to the corresponding old symbol. Let h₁ be the homomorphism mapping (Σ ∪ Δ)* to Δ* defined by

h₁(a) = a for a in Δ,
h₁(a) = ε for a in Σ.

Let L₁ = h₁⁻¹(L). Then L₁ is the set of strings in Σ*b₁Σ*b₂ ⋯ Σ*bₙΣ* such that b₁b₂ ⋯ bₙ is in L. Let R be the regular set consisting of all words of the form a₁x₁a₂x₂ ⋯ aₘxₘ such that

1) the aᵢ's are in Σ,
2) the xᵢ's are in Δ*,
3) there exist states q₀, q₁, …, qₘ such that qₘ is in F and for 1 ≤ i ≤ m, δ(qᵢ₋₁, aᵢ) contains (qᵢ, xᵢ).

Note that the xᵢ's may be ε. The reader may easily show R to be a regular set by constructing a nondeterministic finite automaton accepting R. This NFA guesses the sequence of states q₁, q₂, …, qₘ.

Now L₁ ∩ R is the set of all words of the form a₁x₁a₂x₂ ⋯ aₘxₘ, m ≥ 0, where the aᵢ's are in Σ, the xᵢ's are in Δ*, x₁x₂ ⋯ xₘ is in L, and δ(q₀, a₁a₂ ⋯ aₘ) contains (p, x₁x₂ ⋯ xₘ) for some p in F. None of the xᵢ's is of length greater than k, where k is the length of the longest x such that (p, x) is in δ(q, a) for some p and q in Q and a in Σ.

Finally, let h₂ be the homomorphism that maps a to a for each a in Σ, and b to ε for each b in Δ. Then M⁻¹(L) = h₂(L₁ ∩ R) is in 𝒞 by Lemma 11.2, since h₂ never causes more than k consecutive symbols to be mapped to ε.
11.3 OTHER CLOSURE PROPERTIES OF TRIOS

Trios and full trios are closed under many other operations. In this section we present several of these closure properties.
Theorem 11.3 Every full trio is closed under quotient with a regular set.

Proof Let 𝒞 be a full trio, L ⊆ Σ₁* a member of 𝒞, and R ⊆ Σ₁* a regular set. For each a in Σ₁ let a′ be a new symbol, and let Σ₁′ be the set of all such symbols. Let h₁ and h₂ be the homomorphisms from (Σ₁ ∪ Σ₁′)* to Σ₁* defined by h₁(a) = h₁(a′) = a, h₂(a) = ε, and h₂(a′) = a. Then

L/R = h₂(h₁⁻¹(L) ∩ (Σ₁′)*R),

and hence L/R is in 𝒞. That is, h₁⁻¹(L) is the set of words in L with each symbol primed or unprimed independently. Thus h₁⁻¹(L) ∩ (Σ₁′)*R is the set of words x′y such that x′ consists only of primed symbols, y consists only of unprimed symbols, y is in R, and if z is x′ with the primes removed, then zy is in L. It follows that h₂(h₁⁻¹(L) ∩ (Σ₁′)*R) is the set of all strings z as described above, that is, L/R.
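The quotient operation itself is simple to state operationally. The following sketch (added for illustration; the finite language and the regex standing in for R are made up) computes L/R = {z | zy is in L for some y in R} by brute force:

```python
import re

# Brute-force quotient of a finite sample language by a regular set.
L = {'ab', 'abab', 'abb'}
R = re.compile(r'(ab)*$')          # the regular set R = (ab)*, as a regex

def quotient(L, R):
    """L/R = {z | zy in L for some y in R}, by trying every split of
    every word of L and testing the suffix for membership in R."""
    result = set()
    for w in L:
        for i in range(len(w) + 1):
            z, y = w[:i], w[i:]
            if R.match(y):          # anchored by the trailing $ in the pattern
                result.add(z)
    return result

assert quotient(L, R) == {'', 'ab', 'abab', 'abb'}
```

For infinite L this brute force is of course unavailable; the theorem's point is that the primed-alphabet construction performs the same splitting with trio operations only.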
Theorem 11.4 Trios are closed under substitution by ε-free regular sets, and full trios are closed under substitution by regular sets.

Proof Let 𝒞 be a trio, L ⊆ Σ₁* a member of 𝒞, and s: Σ₁ → 2^(Σ₂*) a substitution such that for each a in Σ₁, s(a) is regular. For the time being assume that Σ₁ and Σ₂ are disjoint and that s(a) does not contain ε.

Let x be a string in L. By an inverse homomorphism we can insert arbitrary strings from Σ₂* among the symbols of x. By intersecting with a regular set we can assure that the string inserted after each symbol a is in s(a). Then by limited erasing we can erase the symbols of x, leaving a string from s(x). More precisely, let h₁: (Σ₁ ∪ Σ₂)* → Σ₁* be the homomorphism defined by h₁(a) = a for a in Σ₁ and h₁(a) = ε for a in Σ₂, and let h₂: (Σ₁ ∪ Σ₂)* → Σ₂* be the homomorphism defined by h₂(a) = ε for a in Σ₁ and h₂(a) = a for a in Σ₂. Then

s(L) = h₂(h₁⁻¹(L) ∩ R), where R is the regular set (⋃ over a in Σ₁ of a·s(a))*.

Now R is a regular set, since each s(a) is. Since s(a) is ε-free, h₂ erases at most every other symbol, so s(L) is in 𝒞 by Lemma 11.2. The proof that full trios are closed under substitution by regular sets is identical except for the fact that s may not be ε-free.

If Σ₁ and Σ₂ are not disjoint, replace each symbol of Σ₂ by a new symbol, and follow the above operations by an ε-free homomorphism to restore the old symbols.
11.4 ABSTRACT FAMILIES OF LANGUAGES

Many of the families of languages we have studied have closure properties that are not implied by the trio or full trio operations. Predominant among these are union, concatenation, and Kleene closure. For this reason, two other sets of closure properties have been heavily studied, and in fact their consequences were studied long before the trio and full trio. Define a class of languages to be an abstract family of languages (AFL) if it is a trio and also closed under union, concatenation, and positive closure (recall that L⁺, the positive closure of L, is ⋃ over i ≥ 1 of Lⁱ). Call a class of languages a full AFL if it is a full trio and closed under union, concatenation, and Kleene closure.

For example, we proved in Chapters 3 and 6 that the regular sets and context-free languages are full AFL's. The r.e. sets are also a full AFL, and we leave the proof as an exercise. The CSL's are an AFL, but not a full AFL, since they are not closed under general homomorphism (see Exercises 9.10 and 9.14). We saw that the regular sets are the smallest full trio. They are also a full AFL and therefore the smallest full AFL. The ε-free regular sets are the smallest AFL, as well as the smallest trio.

The next theorem states that AFL's are closed under substitution into regular sets. That is, for each symbol of an alphabet, we associate a language from an AFL 𝒞. Then replacing each symbol in each string in some regular set by the associated language yields a language in 𝒞.
Theorem 11.5 Let 𝒞 be an AFL that contains some language containing ε, and let R ⊆ Σ* be a regular set. Let s be a substitution defined by s(a) = Lₐ for each a in Σ, where Lₐ is a member of 𝒞. Then s(R) is in 𝒞.

Proof The proof is by induction on the number of operators in a regular expression denoting R. If there are zero operators, then the regular expression must be one of ∅, ε, or a, for a in Σ. If the regular expression is a, the result of the substitution is Lₐ, which is in 𝒞 by Lemma 11.1. If it is ∅, then the result of the substitution is ∅, which is in 𝒞 by closure under intersection with a regular set. If the regular expression is ε, the result of the substitution is {ε}. We claim {ε} is in 𝒞, because some L containing ε is in 𝒞, and L ∩ {ε} = {ε} is in 𝒞 by closure under intersection with a regular set.

The induction step is easy. AFL's are closed under union and concatenation, and we can show closure under * easily, given L in 𝒞 containing ε. That is, we already showed {ε} is in 𝒞. If L₁ is any language in 𝒞, then L₁⁺ is in 𝒞, so L₁* = L₁⁺ ∪ {ε} is in 𝒞. Therefore, the AFL 𝒞 is closed under ∪, ·, and *, from which the inductive step follows. Thus 𝒞 is closed under substitution into a regular set.
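Substitution into a regular set is easy to exercise on a finite slice. The sketch below (an added illustration; the substitution s and the sample of R are made up, and only finitely many words of R are examined) applies s(a) = Lₐ pointwise:

```python
from itertools import product

# Illustrative substitution: each symbol is replaced by a small language.
s = {'a': {'0', '00'}, 'b': {'1'}}

def subst_word(w):
    """s(w): replace each symbol of w independently by any member of its
    associated language, concatenating the choices in order."""
    return {''.join(parts) for parts in product(*(s[a] for a in w))}

# A finite sample of the regular set R = (ab)*:
R_sample = {'', 'ab', 'abab'}
sR = set().union(*(subst_word(w) for w in R_sample))
assert '001' in sR and '0101' in sR and '' in sR
```

The theorem's induction on regular expressions does the same thing symbolically: the base cases handle ∅, ε, and single symbols, and the inductive step uses closure of the AFL under ∪, ·, and *.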
In general, AFL's are not closed under substitution of languages in the family into other languages in the family, although most of the common AFL's, such as the CFL's, the recursive sets, and the r.e. sets, are. However, any AFL closed under intersection is closed under substitution. The proof is similar to that of Theorem 11.5 and is left as an exercise.

We also leave as an exercise the fact that all AFL's, even those with no language containing ε, are closed under substitution into ε-free regular sets.
11.5 INDEPENDENCE OF THE AFL OPERATIONS

The definition of an AFL requires six closure properties. However, to show that a family of languages is an AFL, one need not show all six properties, since they are not independent. For example, any family of languages closed under ∪, +, ε-free h, h⁻¹, and ∩R is necessarily closed under ·.† Similarly, ∪ follows from the other five operations, and the same holds for ∩R. We shall only prove the dependence of ·.

Theorem 11.6 Any family of languages closed under ∪, +, ε-free h, h⁻¹, and ∩R is closed under ·.

Proof Let 𝒞 be such a family of languages, and let L₁ ⊆ Σ* and L₂ ⊆ Σ* be in 𝒞. We may assume without loss of generality that ε is not in L₁ or L₂. This assumption is justified by the fact that

L₁L₂ = (L₁ − {ε})(L₂ − {ε}) ∪ L₁′ ∪ L₂′,

where L₁′ is L₁ if ε is in L₂ and ∅ otherwise; L₂′ is L₂ if ε is in L₁ and ∅ otherwise. As 𝒞 is closed under union, if we can show that (L₁ − {ε})(L₂ − {ε}) is in 𝒞, we shall have shown that L₁L₂ is in 𝒞.

Let a and b be symbols not in Σ. As 𝒞 is a trio, Theorem 11.1 tells us 𝒞 is closed under ε-free GSM mappings. Let M₁ be the GSM that prints a, followed by its first input symbol, then copies its input, and let M₂ be another GSM that prints b with its first input symbol, then copies its input. Then M₁(L₁) = aL₁ and M₂(L₂) = bL₂, and both are in 𝒞. By closure under ∪, +, and ∩R,

(aL₁ ∪ bL₂)⁺ ∩ aΣ*bΣ* = aL₁bL₂

is in 𝒞. Define g to be the homomorphism g(a) = g(b) = ε, and g(c) = c for all c in Σ. Then g is a 2-limited erasing on aL₁bL₂, since L₁ and L₂ are assumed ε-free. By Lemma 11.2, g(aL₁bL₂) = L₁L₂ is in 𝒞.
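The marker trick in the proof can be replayed by brute force on finite samples. The sketch below (an added illustration with made-up sample languages; only concatenations of at most two marked blocks are generated, which suffices here) follows the proof step by step:

```python
# Replaying Theorem 11.6 on finite samples: mark with a and b, take the
# positive closure, intersect with a(0+1)*b(0+1)*, then erase the markers.
L1, L2 = {'0', '01'}, {'1', '10'}        # epsilon-free sample languages
marked = {'a' + w for w in L1} | {'b' + x for x in L2}

# Concatenations of up to 2 marked blocks approximate (aL1 ∪ bL2)+.
plus = {u + v for u in marked for v in marked} | marked

def in_a_sigma_b_sigma(s):
    """Membership in a(0+1)*b(0+1)*: starts with a, one a, one b, a before b."""
    return (s.startswith('a') and s.count('a') == 1 and s.count('b') == 1
            and s.index('a') < s.index('b'))

filtered = {s for s in plus if in_a_sigma_b_sigma(s)}      # = aL1bL2
erased = {s.replace('a', '').replace('b', '') for s in filtered}
assert erased == {w + x for w in L1 for x in L2}           # = L1L2
```

Erasing the two markers removes at most two consecutive symbols (a followed by nothing erasable, b likewise, since L₁ and L₂ are ε-free), which is why the final step is a 2-limited erasing.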
11.6 SUMMARY

In Fig. 11.3 we list some operations under which trios, full trios, AFL's, and full AFL's are closed. The properties have all been proved in this chapter or the exercises. Recall that the regular sets, CFL's, and r.e. sets are full AFL's; the CSL's and recursive sets are AFL's. The DCFL's are not even trios, however.

Some other operations do not fit into the theory of trios and AFL's. In Fig. 11.4 we summarize the closure properties of six classes of languages under these operations. The question of whether the CSL's are closed under complementation is a long-standing open problem, and is equivalent to their closure under MIN.

† We use ∩R for "intersection with a regular set," h for "homomorphism," and h⁻¹ for "inverse homomorphism." The dot stands for concatenation.
Operation                               Trio   Full trio   AFL   Full AFL

h⁻¹                                      ✓        ✓         ✓       ✓
ε-free h                                 ✓        ✓         ✓       ✓
h                                                 ✓                 ✓
∩R                                       ✓        ✓         ✓       ✓
ε-free GSM mappings                      ✓        ✓         ✓       ✓
GSM mappings                                      ✓                 ✓
Inverse GSM mappings                     ✓        ✓         ✓       ✓
Limited erasing                          ✓        ✓         ✓       ✓
Quotient with regular set                         ✓                 ✓
INIT                                              ✓                 ✓
Substitution into regular sets                              ✓       ✓
Substitution by ε-free regular sets      ✓        ✓         ✓       ✓
Substitution by regular sets                      ✓                 ✓

Fig. 11.3 Summary of closure properties.
While this chapter has concerned itself with closure properties and not decision properties, we have reached a good point to summarize these properties as well, for the six classes of languages mentioned in Fig. 11.4. We show in Fig. 11.5 whether each of 10 important properties is decidable for the six classes. D means decidable, U means undecidable, T means trivially decidable (because the answer is always "yes"), and ? means the answer is not known. The results in Fig. 11.5 are proved in various chapters, chiefly Chapters 3, 6, 8, and 10.
[Fig. 11.4 Some other closure properties: complementation, intersection, substitution, MIN, MAX, CYCLE, and reversal, for the regular sets, DCFL's, CFL's, CSL's, recursive sets, and r.e. sets; the individual entries are not recoverable from this copy.]
Question                         Regular   DCFL's   CFL's   CSL's   Recursive   r.e.
                                  sets                               sets       sets

Is w in L?                          D        D        D       D        D         U
Is L = ∅?                           D        D        D       U        U         U
Is L = Σ*?                          D        D        U       U        U         U
Is L₁ = L₂?                         D        ?        U       U        U         U
Is L₁ ⊆ L₂?                         D        U        U       U        U         U
Is L₁ ∩ L₂ = ∅?                     D        U        U       U        U         U
Is L = R, where R is a given
  regular set?                      D        D        U       U        U         U
Is L regular?                       T        D        U       U        U         U
Is the intersection of two
  languages a language of the
  same type?                        T        U        U       T        T         T
Is the complement of a language
  also a language of the same
  type?                             T        T        U       ?        T         U

Fig. 11.5 Some decision properties.
EXERCISES

*S11.1 Show that the linear languages are a full trio but not an AFL.

11.2 Show that the ε-free regular sets are an AFL.

11.3 Show that a full trio is closed under INIT, SUB, and FIN, where SUB(L) = {x | wxy is in L for some w and y} and FIN(L) = {x | wx is in L for some w}.

11.4 Show that not every AFL is closed under *, h, INIT, SUB, FIN, quotient with a regular set, or substitution by regular sets.

*11.5 Show that not every full trio is closed under ∪, ·, +, *, or substitution into regular sets. [Hint: The linear languages suffice for all but union. (To prove that certain languages are not linear, use Exercise 6.11.) To show nonclosure under union, find two full trios 𝒞₁ and 𝒞₂ containing languages L₁ and L₂, respectively, such that L₁ ∪ L₂ is in neither 𝒞₁ nor 𝒞₂. Show that 𝒞₁ ∪ 𝒞₂ is also a full trio.]

11.6 Prove each of the closure and nonclosure properties in Fig. 11.4 (some have been asked for in previous exercises or proved in previous theorems).

*11.7 The interleaving of two languages L₁ and L₂, denoted IL(L₁, L₂), is

{w₁x₁w₂x₂ ⋯ wₖxₖ | k arbitrary, w₁w₂ ⋯ wₖ is in L₁ and x₁x₂ ⋯ xₖ is in L₂}.†

Show that if 𝒞 is any trio, L is in 𝒞, and R is a regular set, then IL(L, R) is in 𝒞.

11.8 Are the following closed under IL?
a) regular sets  b) CFL's  c) CSL's  d) recursive sets  e) r.e. sets

11.9 An A-transducer is a GSM that may move (make output and change state) on ε-input. Show that every full trio is closed under A-transductions.

11.10 Find a GSM that maps aⁱ to the set {aʲbᵏ | i ≤ k ≤ 2i} for all i.

*11.11 Show that any class of languages closed under h, h⁻¹, and ∩R is closed under union.

*11.12 Show that any class of languages closed under h, h⁻¹, ·, and ∪ is closed under ∩R.

**11.13 Give examples of classes of languages closed under
a) ∪, ·, +, ε-free h, h⁻¹, and ∩R, but not h;
b) ∪, ·, +, ε-free h, and ∩R, but not h⁻¹;
c) ∪, ·, +, h⁻¹, and ∩R, but not ε-free h.

*11.14 Show that an AFL is closed under complementation if and only if it is closed under MIN.

*11.15 A scattered-context grammar, G = (V, T, P, S), has productions of the form (A₁, …, Aₙ) → (α₁, …, αₙ), where each αᵢ is in (V ∪ T)⁺. If (A₁, …, Aₙ) → (α₁, …, αₙ) is in P, then we write

β₁A₁β₂A₂ ⋯ βₙAₙβₙ₊₁ ⇒ β₁α₁β₂α₂ ⋯ βₙαₙβₙ₊₁.

Note some β's may be ε. Let ⇒* be the reflexive, transitive closure of ⇒. The language generated by G is {x | x is in T⁺ and S ⇒* x}.
a) Prove that the scattered-context languages form an AFL.
b) What class of languages is generated by scattered-context grammars if we allow productions with the αᵢ's possibly ε?

**11.16 An AFL 𝒞 is said to be principal if there is a language L such that 𝒞 is the least AFL containing L.
a) Do the CFL's form a principal AFL?
b) Prove that the least AFL containing {aⁿbⁿ | n > 0} is properly contained in the CFL's.
c) Let 𝒞₀, 𝒞₁, 𝒞₂, … be an infinite sequence of AFL's such that 𝒞ᵢ is properly contained in 𝒞ᵢ₊₁ for all i ≥ 0. Prove that the union of the 𝒞ᵢ's forms an AFL that is not principal.
d) Give an example of a nonprincipal AFL.

11.17 Show that if an AFL is closed under intersection, then it is closed under substitution.

† Note some wᵢ's and xᵢ's may be ε.
Solutions to Selected Exercises

11.1 To prove that the linear languages are closed under homomorphism, let G be a linear grammar and h a homomorphism. If each production A → wBx or A → y is replaced by A → h(w)Bh(x) or A → h(y), respectively, then the resulting grammar generates h(L(G)). To show closure under h⁻¹ and ∩R we could use machine-based proofs analogous to the proofs for CFL's, since by Exercise 6.13(a), the linear languages are characterized by one-turn PDA's. We shall instead give grammar-based proofs.

Let G = (V, T, P, S) be a linear CFG, and M = (Q, T, δ, q₀, F) a DFA. Construct linear grammar G′ = (V′, T, P′, S′) generating L(G) ∩ L(M). Let V′ = {[qAp] | q and p are in Q and A in V} ∪ {S′}. Then define P′ to have productions

1) S′ → [q₀Sp] for all p in F,
2) [qAp] → w[rBs]x whenever A → wBx is in P, δ(q, w) = r, and δ(s, x) = p,
3) [qAp] → y whenever A → y is in P and δ(q, y) = p.

An easy induction on derivation length shows that [qAp] ⇒* w if and only if A ⇒* w and δ(q, w) = p. Thus S′ ⇒* w if and only if S ⇒* w and δ(q₀, w) is a final state. Hence L(G′) = L(G) ∩ L(M).

Now let G = (V, T, P, S) be a linear grammar and h: Σ* → T* a homomorphism. Suppose k is such that for all a in Σ, |h(a)| ≤ k, and if A → wBx or A → w is in P, then |w| < k and |x| < k. Let G″ = (V″, Σ, P″, [S]), where V″ consists of all symbols [wAx] such that A is in V, and w and x in T* are each of length at most 2k − 1. Also in V″ are symbols [y], where |y| ≤ 3k − 1. Intuitively, G″ simulates a derivation of G, storing in its variable what is to the left and right of the variable of G, until the string of terminals either to the right or to the left of the variable of G is of length at least k. Then G″ produces a terminal a on the left or the right and deletes h(a) from what is stored in the variable. The productions of P″ are:

1) If A → w₁Bx₁ is in P, then for all w₂ and x₂ of length at most k − 1, [w₂Ax₂] → [w₂w₁Bx₁x₂] is in P″. If A → y is in P, then [w₂Ax₂] → [w₂yx₂] is in P″.
2) For a in Σ, [h(a)w₁Ax₁] → a[w₁Ax₁], [w₁Ax₁h(a)] → [w₁Ax₁]a, and [h(a)y] → a[y] are in P″.
3) [ε] → ε.

It follows by induction on derivation length that [S] ⇒* w₁[w₂Ax₂]x₁ if and only if S ⇒* h(w₁)w₂Ax₂h(x₁). Thus [S] ⇒* v if and only if S ⇒* h(v), and hence L(G″) = h⁻¹(L(G)).

To show that the linear languages are not an AFL, we show they are not closed under concatenation. Surely {aⁱbⁱ | i ≥ 1} and {cʲdʲ | j ≥ 1} are linear languages, but their concatenation is not, by Exercise 6.12.
BIBLIOGRAPHIC NOTES

The study of abstract families of languages was initiated by Ginsburg and Greibach [1969], who proved Theorems 11.1 through 11.5 and Lemma 11.1. The central importance of the trio in this theory is pointed out by Ginsburg [1975]. Theorem 11.6 on independence of the operators appears in Greibach and Hopcroft [1969]; a solution to Exercise 11.13 can also be found there. The notion of limited erasing is also due to Greibach and Hopcroft [1969]. That AFL's closed under intersection are closed under substitution was first proved by Ginsburg and Hopcroft [1970]. An enormous amount of literature concerns itself with abstract families of languages; we mention only Ginsburg and Greibach [1970], dealing with principal AFL's (Exercise 11.16), and Greibach [1970], who attempts to work substitution into the theory. A summary and additional references can be found in Ginsburg [1975].

The theory of families of languages has, from its inception, been connected with the theory of automata. Ginsburg and Greibach [1969] show that a family of languages is a full AFL if and only if it is defined by a family of nondeterministic automata with a one-way input. Of course, the notion of a "family of automata" must be suitably defined, but, roughly, each such family is characterized by a set of rules whereby it may access or update its storage. The "if" part was proved independently in Hopcroft and Ullman [1967b]. Chandler [1969] characterized families of deterministic automata with a one-way input, in terms of closure properties, and Aho and Ullman [1970] did the same for deterministic automata with a two-way input. Curiously, no characterization for two-way nondeterministic automata is known. There have also been attempts to codify a theory of grammars, chiefly subfamilies of the CFG's. Gabriellian and Ginsburg [1974] and Cremers and Ginsburg [1975] wrote the basic papers in this area.

The GSM was defined by Ginsburg [1962], and the study of GSM mappings and their properties commenced with Ginsburg and Rose [1963b]. An important unresolved issue concerns testing for equivalence of two sequential transducers. That equivalence is decidable for Moore machines (and hence for Mealy machines, which GSM's generalize) has been known since Moore [1956]. Griffiths [1968] showed that the equivalence problem for ε-free GSM's is undecidable, while Bird [1973] gave a decision algorithm for the equivalence of two-tape automata, which are more general than deterministic GSM's.

Scattered-context grammars (Exercise 11.15) are discussed in Greibach and Hopcroft [1969].
CHAPTER 12

COMPUTATIONAL COMPLEXITY THEORY

Language theory classifies sets by their structural complexity. Thus regular sets are regarded as "simpler" than CFL's, because the finite automaton has less complex structure than a PDA. Another classification, called computational complexity, is based on the amount of time, space, or other resource needed to recognize a language on some universal computing device, such as a Turing machine.

Although computational complexity is primarily concerned with time and space, there are many other possible measures, such as the number of reversals in the direction of travel of the tape head on a single-tape TM. In fact, one can define a complexity measure abstractly and prove many of the results in a more general setting. We choose to present the results for the specific examples of time and space, since this approach renders the proofs more intuitive. In Section 12.7 we briefly outline the more abstract approach.
12.1 DEFINITIONS

Space complexity

Consider the off-line Turing machine M of Fig. 12.1. M has a read-only input tape with endmarkers and k semi-infinite storage tapes. If for every input word of length n, M scans at most S(n) cells on any storage tape, then M is said to be an S(n) space-bounded Turing machine, or of space complexity S(n). The language recognized by M is also said to be of space complexity S(n).

Fig. 12.1 Multitape Turing machine with read-only input.

Note that the Turing machine cannot rewrite on the input and that only the length of the storage tapes used counts in computing the tape bound. This restriction enables us to consider tape bounds of less than linear growth. If the TM could rewrite on the input tape, then the length of the input would have to be included in calculating the space bound. Thus no space bound could be less than linear.
Time complexity

Consider the multitape TM M of Fig. 12.2. The TM has k two-way infinite tapes, one of which contains the input. All tapes, including the input tape, may be written upon. If for every input word of length n, M makes at most T(n) moves before halting, then M is said to be a T(n) time-bounded Turing machine, or of time complexity T(n). The language recognized by M is said to be of time complexity T(n).

Fig. 12.2 Multitape Turing machine.

The two different models for time and space complexity were selected with an eye toward making certain proofs simple, and some variation in the models is feasible. For example, if S(n) ≥ n, then we can use the single-tape TM as our model without changing the class of languages accepted in space S(n). We cannot, however, when discussing time complexity, use the single-tape TM, or any fixed number of tapes, without possibly losing some languages from the class of languages accepted in time T(n).
Example 12.1 Consider the language

L = {wcwᴿ | w in (0+1)*}.

Language L is of time complexity n + 1, since there is a Turing machine M₁, with two tapes, that copies the input to the left of the c onto the second tape. Then, when a c is found, M₁ moves its second tape head to the left, through the string it has just copied, and simultaneously continues to move its input tape head to the right. The symbols under the two heads are compared as the heads move. If all pairs of symbols match and if, in addition, the number of symbols to the right and left of the lone c are equal, then M₁ accepts. It is easy to see that M₁ makes at most n + 1 moves if the input is of length n.

There is another Turing machine, M₂, of space complexity log₂n accepting L. M₂ uses two storage tapes for binary counters. First, the input is checked to see that only one c appears, and that there are equal numbers of symbols to the left and right of the c. Next the words on the right and left are compared symbol by symbol, using the counters to find corresponding symbols. If they disagree, M₂ halts without accepting. If all symbols match, M₂ accepts.
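M₂'s strategy — never copy w, only maintain counters of logarithmic size — can be sketched in Python, with ordinary integers standing in for the binary counter tapes (an added illustration, not the machine itself):

```python
# Recognition of L = {w c w^R | w in (0+1)*} in the spirit of M2:
# the only mutable state is a pair of O(log n) positions, never a copy of w.
def accepts(s):
    # Check exactly one c, with equal-length sides (done with counters).
    if s.count('c') != 1:
        return False
    m = s.index('c')
    if len(s) - m - 1 != m:
        return False
    # Compare symbol i of the left side with the i-th symbol from the right
    # end; i and the derived position are the two counters.
    for i in range(m):
        if s[i] != s[len(s) - 1 - i]:
            return False
    return True

assert accepts('01c10') and accepts('c') and not accepts('01c01')
```

The comparison loop revisits the input repeatedly instead of storing it, trading time for space — exactly the trade-off between M₁ and M₂ in the example.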
Special assumptions about time and space complexity functions

It should be obvious that every TM uses at least one cell on all inputs, so if S(n) is a space complexity measure, we may assume S(n) ≥ 1 for all n. We make the useful assumption that when we talk of "space complexity S(n)" we really mean max(1, ⌈S(n)⌉). For example, in Example 12.1, we said that TM M₂ was of "space complexity log₂n." This makes no sense for n = 0 or 1, unless one accepts that "log₂n" is shorthand for max(1, ⌈log₂n⌉).

Similarly, it is reasonable to assume that any time complexity function T(n) is at least n + 1, for this is the time needed just to read the input and verify that the end has been reached by reading the first blank.† We thus make the convention that "time complexity T(n)" means max(n + 1, ⌈T(n)⌉). For example, the value of time complexity n log₂n at n = 1 is 2, and at n = 2 its value is 3.

† Note, however, that there are TM's that accept or reject without reading all their input. We choose to eliminate them from consideration.
Nondeterministic time and space complexity

The concepts of time- and space-bounded Turing machines apply equally well to nondeterministic machines. A nondeterministic TM is of time complexity T(n) if no sequence of choices of move causes the machine to make more than T(n) moves. It is of space complexity S(n) if no sequence of choices enables it to scan more than S(n) cells on any storage tape.
Complexity classes

The family of languages of space complexity S(n) is denoted by DSPACE(S(n)); the languages of nondeterministic space complexity S(n) are collectively called NSPACE(S(n)). The family of languages of time complexity T(n) is denoted DTIME(T(n)), and that of nondeterministic time complexity T(n) is denoted NTIME(T(n)). All these families of languages are called complexity classes. For example, language L of Example 12.1 is in DTIME(n)† and in DSPACE(log₂n). L is therefore also in NTIME(n) and NSPACE(log₂n), as well as larger classes such as DTIME(n²) or NSPACE(n).

12.2 LINEAR SPEED-UP, TAPE COMPRESSION, AND REDUCTIONS IN THE NUMBER OF TAPES
Since the number of states and the tape alphabet size of a Turing machine can be arbitrarily large, the amount of space needed to recognize a set can always be compressed by a constant factor. This is achieved by encoding several tape symbols into one. Similarly, one can speed up a computation by a constant factor. Thus in complexity results it is the functional rate of growth (e.g., linear, quadratic, exponential) that is important, and constant factors may be ignored. For example, we shall talk about complexity log n without specifying the base of logarithms, since log_b n and log_c n differ by a constant factor, namely log_b c. In this section we establish the basic facts concerning linear speed-up and compression, as well as considering the effect of the number of tapes on complexity.

Tape compression

Theorem 12.1 If L is accepted by an S(n) space-bounded Turing machine with k storage tapes, then for any c > 0, L is accepted by a cS(n) space-bounded TM.‡

† Recall that n really means max(n + 1, ⌈n⌉) = n + 1 for time complexity.
‡ Note that, by our convention, cS(n) is regarded as max(1, ⌈cS(n)⌉).
Proof Let M₁ be an S(n) space-bounded off-line Turing machine accepting L. The proof turns on constructing a new Turing machine M₂ that simulates M₁, where for some constant r, each storage tape cell of M₂ holds a symbol representing the contents of r adjacent cells of the corresponding tape of M₁. The finite control of M₂ can keep track of which of the cells of M₁, among those represented, is actually scanned by M₁. Detailed construction of the rules of M₂ from the rules of M₁ is left to the reader. Let r be such that rc ≥ 2. Then M₂ can simulate M₁ using no more than ⌈S(n)/r⌉ cells on any tape. If S(n) ≥ r, this number is no more than cS(n). If S(n) < r, then M₂ can store in one cell the contents of any tape. Thus M₂ uses only one cell in the latter case.

Corollary If L is in NSPACE(S(n)), then L is in NSPACE(cS(n)), where c is any constant greater than zero.

Proof If M₁ above is nondeterministic, let M₂ be nondeterministic in the above construction.
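The packing step of the proof is mechanical. The sketch below (an added illustration; the tape contents, block size, and blank symbol are made up) encodes r adjacent cells as one tuple-symbol and shows how a scanned position translates:

```python
# Tape compression: encode r adjacent cells as one tuple-symbol, cutting
# the cell count from len(tape) to ceil(len(tape)/r).
def compress(tape, r, blank='B'):
    padded = tape + [blank] * (-len(tape) % r)       # pad to a multiple of r
    return [tuple(padded[i:i + r]) for i in range(0, len(padded), r)]

tape = list('0110100')           # 7 cells of M1
packed = compress(tape, r=3)     # ceil(7/3) = 3 cells of M2
assert len(packed) == 3
assert packed[0] == ('0', '1', '1')
# The scanned position j of M1 becomes cell j // r, offset j % r, in M2;
# the offset is what M2's finite control remembers.
j = 4
assert packed[j // 3][j % 3] == tape[j]
```

The offset j % r never appears on the tape — it is carried in the finite control, which is why the construction multiplies the state set and alphabet size but divides the space used.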
Reduction in the number of tapes for space complexity classes

Theorem 12.2 If a language L is accepted by an S(n) space-bounded TM with k storage tapes, it is accepted by an S(n) space-bounded TM with a single storage tape.

Proof Let M₁ be an S(n) space-bounded TM with k storage tapes, accepting L. We may construct a new TM M₂ with one storage tape, which simulates the k storage tapes of M₁ on k tracks. The technique was used in Theorem 7.2. M₂ uses no more than S(n) cells.

From now on we assume that any S(n) space-bounded TM has but one storage tape, and if S(n) ≥ n, that it is a single-tape TM, rather than an off-line TM with one storage tape and one input tape.
Linear speed-up

Before considering time bounds, let us introduce the following notation. Let f(n) be a function of n. The expression sup as n → ∞ of f(n) is taken to be the limit as n → ∞ of the least upper bound of f(n), f(n + 1), f(n + 2), …. Likewise, inf as n → ∞ of f(n) is the limit as n → ∞ of the greatest lower bound of f(n), f(n + 1), f(n + 2), …. If f(n) converges to a limit as n → ∞, then that limit is both the inf and the sup of f(n) as n → ∞.

Example 12.2 Let f(n) = 1/n for n even, and f(n) = n for n odd. The least upper bound of f(n), f(n + 1), … is clearly ∞ for any n, because of the terms for odd n. Hence the sup of f(n) as n → ∞ is ∞. However, because of the terms with n even, it is also true that the inf of f(n) as n → ∞ is 0.

For another example, suppose f(n) = n/(n + 1). The greatest lower bound of n/(n + 1), (n + 1)/(n + 2), … is n/(n + 1), and the limit of n/(n + 1) as n → ∞ is 1, so the inf of n/(n + 1) is 1. The least upper bound of n/(n + 1), (n + 1)/(n + 2), … is 1 for any n, so the sup of n/(n + 1) is 1 as well.
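Finite truncations can only suggest the limiting values, but they make the first part of Example 12.2 concrete. The sketch below (an added numerical illustration; the truncation bound N is an arbitrary assumption) computes the bounds over a finite tail:

```python
# Numerical illustration of Example 12.2: f(n) = 1/n (n even), n (n odd).
# Finite tails f(n), ..., f(N-1) approximate the bounds in the definition.
def f(n):
    return 1 / n if n % 2 == 0 else n

def tail_bounds(f, n, N=10**6):
    """Greatest lower bound and least upper bound of f over [n, N)."""
    tail = [f(m) for m in range(n, N)]
    return min(tail), max(tail)

lo, hi = tail_bounds(f, n=10, N=2000)
assert lo == f(1998)   # even terms drive the lower bound toward inf = 0
assert hi == f(1999)   # odd terms drive the upper bound toward sup = infinity
```

As N grows, the lower bound tends to 0 and the upper bound grows without limit, matching the inf and sup computed in the example.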
Theorem 12.3 If L is accepted by a k-tape T(n) time-bounded Turing machine M₁, then L is accepted by a k-tape cT(n) time-bounded TM M₂ for any c > 0, provided that k > 1 and the inf as n → ∞ of T(n)/n is ∞.

Proof A TM M₂ can be constructed to simulate M₁ in the following manner. First M₂ copies the input onto a storage tape, encoding m symbols into one. (The value of m will be determined later.) From this point on, M₂ uses this storage tape as the input tape and uses the old input tape as a storage tape.
M₂ will encode the contents of M₁'s storage tapes by combining m symbols into one. During the course of the simulation, M₂ simulates a large number of moves of M₁ in one basic step consisting of eight moves of M₂. Call the cells currently scanned by each of M₂'s heads the home cells. The finite control of M₂ records, for each tape, which of the m symbols of M₁ represented by each home cell is scanned by the corresponding head of M₁.

To begin a basic step, M₂ moves each head to the left once, to the right twice, and to the left once, recording the symbols to the left and right of the home cells in its finite control. Four moves of M₂ are required, after which M₂ has returned to its home cells.

Next, M₂ determines the contents of all of M₁'s tape cells represented by the home cells and their left and right neighbors at the time when some tape head of M₁ first leaves the region represented by the home cell and its left and right neighbors. (Note that this calculation by M₂ takes no time. It is built into the transition rules of M₂.) If M₁ accepts before some tape head leaves the represented region, M₂ accepts. If M₁ halts, M₂ halts. Otherwise M₂ then visits, on each tape, the two neighbors of the home cell, changing these symbols and that of the home cell if necessary. M₂ positions each of its heads at the cell that represents the symbol that M₁'s corresponding head is scanning at the end of the moves simulated. At most four moves of M₂ are needed.

It takes at least m moves for M₁ to move a head out of the region represented by a home cell and its neighbors. Thus, in eight moves, M₂ has simulated at least m moves of M₁. Choose m such that cm ≥ 16. If M₁ makes T(n) moves, then M₂ simulates these in at most 8⌈T(n)/m⌉ moves. Also, M₂ must copy and encode its input (m cells to one), then return the head of the simulated input tape to the left end. This takes n + ⌈n/m⌉ moves, for a total of
\n/m]+S\T(n)/m]
(12.1)
12.2
moves. As
\x]
<
x
any
4- 1 for
LINEAR SPEED-UP, TAPE COMPRESSION
|
n
upper bounded by
x, (12.1) is
n/m
4-
&T(n)/m
4-
291
4- 2.
(12.2)
Now we have assumed that int^^ T(n)/n = oo, so for any constant d there is an n d such that for all n > n d T(n)/n > d, or put another way, n < T(n)/d. Thus whenever n > 2 (so n + 2 < 2n) and n > nd (12.2) is bounded above by ,
,
T(n)
We
have not yet specified
choose
d
> max
—
m/4
and
4- i,
d.
8
2
m
d
J_ md
Remembering
substitute
16/c
(12.3)
that for
m was chosen so that cm > m in (12.3). Then for
16, all
M
number of moves made by 2 does not exceed cT(n). To recognize the finite number of words of length less than the maximum of 2 4- 1 moves to read its input and and d 2 uses its finite control only, taking n reach the blank marking the end of the input. Thus the time complexity of 2 is n
rc
,
w d ) the
(2,
M
M
max
cT(n). Recall that for time complexity, cT(n) stands for
Corollary
If inf,,.^ T(n)/n
=
and
oo
c
>
0,
(n 4-
1,
\cT(n)]).
then
DTIME(T(h)) = DTIME(c7(n)). Proof
Theorem
is
in
accepted by a 2-tape
TM
Theorem
12.3
time T(n). Clearly of the
if
L
is
accepted by a
accepted by a 1-tape
it
is
same time complexity.
is
does not apply
a constant, not
T(n)
if
is
a constant multiple of
m,
as then
infinity.
M
Theorem 12.4 If L is accepted by a fc-tape cn time-bounded TM, for k for some constant c, then for every e > 0, L is accepted by a fc-tape time-bounded
Proof
Pick
Corollary for
DTM
TM,
However, the construction of Theorem with a more careful analysis of the time bound of 2 shows the following.
inf,,.^ T(n)/n 12.3,
L
a direct proof for any language
12.3
with 2 or more tapes
any
c
1
and
(1 -f ()«
TM.
m=
If
>
>
l/\6c in the proof of
T(n)
=
cn for
some
c>
Theorem
1,
then
12.3.
DTIME(7(n)) = DTIME((1
4- c)n)
0.
Corollary (of Theorems 12.3 and 12.4)
inf^ T(n)/n = oo, then NTIME(7(n)) = NTIME(cT(h)) for any c> 0. = cn for some constant c, then NTIME(7») = NTIME((1 4- c)n\ for any e > 0.
a) If
b) If T(n)
-Pro*?/
The proofs
are analogous to
Theorems
12.3
and
12.4.
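The move count in the proof of Theorem 12.3 is easy to spot-check. The sketch below is our illustration: the bound n + ⌈n/m⌉ + 8⌈T(n)/m⌉ comes from the proof, while the particular c, m, and T(n) are assumed for the demonstration.

```python
from math import ceil

def m2_moves(n, T, m):
    """Moves of the speed-up simulator M2: copy and encode the input,
    then at most 8 moves per basic step, each basic step covering at
    least m moves of the simulated machine M1."""
    return n + ceil(n / m) + 8 * ceil(T(n) / m)

c = 0.5                  # desired speed-up factor
m = ceil(16 / c)         # chosen so that c * m >= 16, as in the proof
T = lambda n: n * n      # satisfies inf T(n)/n = infinity

# For all sufficiently large n, the simulation takes at most c*T(n) moves.
for n in range(50, 500):
    assert m2_moves(n, T, m) <= c * T(n)
print(m2_moves(100, T, m), c * T(100))
```

For words shorter than the threshold, the theorem's finite-control lookup takes over, so the finitely many small n excluded from the loop do not matter.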
Reduction in the number of tapes for time complexity classes

Now let us see what happens to time complexity when we restrict ourselves to one tape. A language like L = {wcw^R | w in (a + b)*} can be recognized in linear time on a two-tape machine, as we saw in Example 12.1. However, on a one-tape machine, L requires time cn² for some c > 0. (The exercises give hints how this may be proved.) Thus permitting only one tape can square the time necessary to recognize a language. That this is the worst that can happen is expressed in the next theorem.
Theorem 12.5 If L is in DTIME(T(n)), then L is accepted in time T²(n) by a one-tape TM.

Proof In the construction of Theorem 7.2, going from a multitape TM to a one-tape TM, the one-tape machine uses at most 6T²(n) steps to simulate T(n) steps of the multitape machine M₁. By Theorem 12.3, we may first speed up M₁ to run in time T(n)/√6. Then the one-tape TM M₂ accepting L runs in T²(n) steps.

Corollary If L is in NTIME(T(n)), then L is accepted by a one-tape NTM of nondeterministic time complexity T²(n).

Proof Analogous to the proof of the theorem.

If we restrict ourselves to two tapes, the time loss is considerably less than if we restrict ourselves to one tape, as the next theorem shows.
Theorem 12.6 If L is accepted by a k-tape T(n) time-bounded Turing machine M₁, then L is accepted by a two-storage-tape TM M₂ in time T(n) log T(n).

Proof The first storage tape of M₂ will have two tracks for each storage tape of M₁. For convenience, we focus on two tracks corresponding to a particular tape of M₁. The other tapes of M₁ are simulated in exactly the same way. The second tape of M₂ is used only for scratch, to transport blocks of data on tape 1.

One particular cell of tape 1, known as B₀, will hold the storage symbols scanned by each of the heads of M₁. Rather than moving head markers, M₂ will transport data across B₀ in the direction opposite to that of the motion of the head of M₁ being simulated. Thus M₂ can simulate each move of M₁ by looking only at B₀. To the right of cell B₀ will be blocks B₁, B₂, ... of exponentially increasing length; that is, Bᵢ is of length 2^{i−1}. Likewise, to the left of B₀ are blocks B₋₁, B₋₂, ..., with B₋ᵢ having length 2^{i−1}. The markers between blocks are assumed to exist, although they will not actually appear until the block is used.

Let a₀ denote the contents of the cell initially scanned by this tape head of M₁. The contents of the cells to the right of this cell are a₁, a₂, ..., and those to the left, a₋₁, a₋₂, .... The values of the aᵢ's may change when they enter B₀; it is not their values, but their positions on the tracks of tape 1 of M₂, that are important. Initially the upper track of M₂ for the tape of M₁ in question is assumed to be empty, while the lower track is assumed to hold ..., a₋₂, a₋₁, a₀, a₁, a₂, .... These are placed in blocks ..., B₋₂, B₋₁, B₀, B₁, B₂, ..., as shown in Fig. 12.3.

Fig. 12.3 Blocks on tape 1. (The figure shows the lower track holding a₋₇, ..., a₋₁ in B₋₃, B₋₂, B₋₁, the cell a₀ in B₀, and a₁, ..., a₇ in B₁, B₂, B₃.)

As mentioned previously, data will be shifted across B₀ and perhaps changed as it passes through. After the simulation of each move of M₁, the following will hold.

1) For any i > 0, either Bᵢ is full (both tracks) and B₋ᵢ is empty, or Bᵢ is empty and B₋ᵢ is full, or the bottom tracks of both Bᵢ and B₋ᵢ are full, while the upper tracks are empty.

2) The contents of any Bᵢ or B₋ᵢ represent consecutive cells on the tape of M₁ in question. For i > 0, the upper track represents cells to the left of those of the lower track; for i < 0, the upper track represents cells to the right of those of the lower track.

3) For i < j, Bᵢ represents cells to the left of those of Bⱼ.

4) B₀ always has only its lower track filled, and its upper track is specially marked.

To see how data is transferred, imagine that the tape head of M₁ in question moves to the left. Then M₂ must shift the corresponding data right. To do so, M₂ moves the head of tape 1 from B₀, where it rests, and goes to the right until it finds the first block, say Bᵢ, that does not have both tracks full. Then M₂ copies all the data of B₀, B₁, ..., B_{i−1} onto tape 2 and stores it in the lower track of B₁, B₂, ..., B_{i−1}, plus the lower track of Bᵢ, assuming that the lower track of Bᵢ is not already filled. If the lower track of Bᵢ is already filled, the upper track of Bᵢ is used instead. In either case, there is just enough room to distribute the data. Also note that the data can be picked up and stored in its new location in time proportional to the length of Bᵢ.

Next, in time proportional to the length of Bᵢ, M₂ can find B₋ᵢ (using tape 2 to measure the distance from Bᵢ to B₀ makes this easy). If B₋ᵢ is completely full, M₂ picks up the upper track of B₋ᵢ and stores it on tape 2. If B₋ᵢ is half full, the lower track is put on tape 2. In either case, what has been copied to tape 2 is next copied to the lower tracks of B₋₍ᵢ₋₁₎, B₋₍ᵢ₋₂₎, ..., B₀. (By rule (1), these tracks have to be empty, since B₁, ..., B_{i−1} were full.) Again, note that there is just enough room to store the data, and all the above operations can be carried out in time proportional to the length of Bᵢ. Also note that the data can be distributed in a manner that satisfies rules (1), (2), and (3) above.

We call all that we have described above a Bᵢ-operation. The case in which the head of M₁ moves to the right is analogous. The successive contents of the blocks as M₁ moves its tape head in question five cells to the left are shown in Fig. 12.4.
Fig. 12.4 Contents of blocks of M₂. (The figure tabulates, row by row, the successive contents of blocks B₋₇, ..., B₇ as the tape head of M₁ in question moves five cells to the left.)
We must perform a Bᵢ-operation at most once per 2^{i−1} moves of M₁, since it takes this long for B₁, B₂, ..., B_{i−1}, which are half empty after a Bᵢ-operation, to fill. Also, a Bᵢ-operation cannot be performed for the first time until the 2^{i−1}th move of M₁. Hence, if M₁ operates in time T(n), M₂ will perform only Bᵢ-operations for those i such that i ≤ log₂ T(n) + 1. We have seen that there is a constant m such that M₂ uses at most m2^i moves to perform a Bᵢ-operation. If M₁ makes T(n) moves, then for each tape of M₁, M₂ makes at most

    T₁(n) = Σ_{i=1}^{log₂ T(n)+1} m 2^i ⌈T(n)/2^{i−1}⌉    (12.4)

moves when simulating one tape of M₁. From (12.4), since each term of the sum is at most 2mT(n) + m2^i, we obtain

    T₁(n) ≤ 2mT(n)[log₂ T(n) + 3],    (12.5)

and from (12.5), for T(n) ≥ 8,

    T₁(n) ≤ 4mT(n) log₂ T(n).

The reader should be able to see that M₂ operates in time proportional to T₁(n), even when M₁ makes moves using different storage tapes rather than only the one on which we have concentrated. By Theorem 12.3, we can modify M₂ to run in no more than T(n) log₂ T(n) steps.
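The sum (12.4) can be evaluated mechanically. The following sketch is our illustration (with the constant m set to 1); it checks that the total cost of all Bᵢ-operations is O(T log T).

```python
from math import ceil, floor, log2

def simulation_cost(T, m=1):
    """Evaluate the sum of (12.4): a B_i-operation costs about m * 2**i
    moves and occurs at most ceil(T / 2**(i-1)) times, for
    i = 1, ..., log2(T) + 1."""
    top = floor(log2(T)) + 1
    return sum(m * 2**i * ceil(T / 2**(i - 1)) for i in range(1, top + 1))

# Each term is roughly 2*m*T and there are about log2(T) + 1 terms,
# so the whole simulation runs in O(T log T) moves.
for T in range(2, 2001):
    assert simulation_cost(T) <= 4 * T * log2(T)
```

The doubling block lengths are exactly what keeps the per-operation cost and the operation frequency in balance: larger blocks cost more to move but are moved exponentially less often.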
Corollary If L is accepted by a k-tape NTM of time complexity T(n), then L is accepted by a two-tape NTM of time complexity T(n) log T(n).

Proof Analogous to the proof of the theorem.
12.3 HIERARCHY THEOREMS

Intuitively, given more time or space, we should be able to recognize more languages or compute more functions. However, the linear speed-up and compression theorems tell us that we have to increase the available space or time by more than a constant factor. But what if we multiply the space or time by a slowly growing function such as log log n? Is it possible that we cannot then recognize any new languages? Is there a time or space bound f(n) such that every recursive language is in DTIME(f(n)), or perhaps in DSPACE(f(n))? The answer to the last question is "no," as we shall prove in the next theorem. However, the answer to the first question depends on whether or not we start with a "well-behaved" function. In this section we shall give suitable definitions of "well-behaved" and show that for well-behaved functions, small amounts of extra time and space do add to our ability to compute.

In Section 12.6 we shall consider arbitrary total recursive functions and the complexity classes they define. There we shall see that strange behavior is exhibited. There are "gaps" in any complexity hierarchy; that is, there exists a function T(n) for which DTIME(T²(n)) = DTIME(T(n)), and in general, for any total recursive function f there is a time complexity T_f(n) for which DTIME(T_f(n)) = DTIME(f(T_f(n))). Similar statements hold for space, and indeed for any reasonable measure of computational complexity. We shall also see that there are languages L for which no "best" recognizer exists; rather there is an infinite sequence of TM's recognizing L, each of which runs much faster than the previous one.
Theorem 12.7 Given any total recursive time bound (space bound) T(n), there is a recursive language L not in DTIME(T(n)) or DSPACE(T(n)), respectively.

Proof We shall show the result for time; the argument for space is analogous. The argument is basically a diagonalization. Since T(n) is total recursive, there is a halting TM M_T that computes it. We construct a language L ⊆ (0 + 1)* that is recursive but not in DTIME(T(n)). Let xᵢ be the ith string in the canonical ordering of (0 + 1)*. In Chapter 8, we ordered single-tape TM's with tape alphabet {0, 1, B}. We can similarly order multitape TM's with arbitrary tape alphabets by replacing their transition functions by binary strings. The only substantial point is that the names of the tape symbols, like those of states, don't matter, so we may assume that all TM's whose input alphabet is {0, 1} have tape alphabet 0, 1, B, X₄, X₅, ... up to some finite X_m, then encode 0, 1, and B by 0, 00, and 000, and encode Xᵢ by 0^i, i ≥ 4. We also permit an arbitrary number of 1's in front of the code for M, so M has arbitrarily long encodings.

We are thus free to talk about Mᵢ, the ith multitape TM. Now define

    L = {xᵢ | Mᵢ does not accept xᵢ within T(|xᵢ|) moves}.

We claim L is recursive. To recognize L, execute the following algorithm, which can surely be implemented on a Turing machine. Given input w of length n, simulate M_T on n to compute T(n). Then determine i such that w = xᵢ. The integer i written in binary is the transition function of some multitape TM Mᵢ (if i in binary is of improper form for a transition function, then Mᵢ has no moves). Simulate Mᵢ on w for T(n) moves, accepting if Mᵢ either halts without accepting or runs for more than T(n) moves and does not accept.

To see that L is not in DTIME(T(n)), suppose L = L(Mᵢ), and Mᵢ is T(n) time bounded. Is xᵢ in L? If so, Mᵢ accepts xᵢ within T(n) steps, where n = |xᵢ|. Thus, by the definition of L, xᵢ is not in L, a contradiction. If xᵢ is not in L, then Mᵢ does not accept xᵢ, so by the definition of L, xᵢ is in L, again a contradiction. Both assumptions lead to contradictions, so the supposition that Mᵢ is T(n) time bounded must be false.
If T'(n) ≥ T(n) for all n, it follows immediately from the definition of a time complexity class that DTIME(T(n)) ⊆ DTIME(T'(n)). If T(n) is a total recursive function, Theorem 12.7 implies there exists a recursive set L not in DTIME(T(n)). Let t(n) be the running time of some Turing machine accepting L, and let T'(n) = max{T(n), t(n)}. Then DTIME(T(n)) ⊊ DTIME(T'(n)), since L is in the latter but not the former. Thus we know that there is an infinite hierarchy of deterministic time complexity classes. A similar result holds for deterministic space complexity classes, and for nondeterministic time and space classes.

Theorem 12.7 demonstrates that for any recursive time or space complexity f(n), there is an f'(n) such that some language is in the complexity class defined by f'(n) but not f(n). We now show that for a well-behaved function f(n), only a slight increase in the growth rate of f(n) is required to yield a new complexity class. Theorems 12.8 and 12.9 are concerned with the increase needed in order to obtain a new deterministic complexity class. These theorems are used later to establish lower bounds on the complexity of various problems. Similar results for nondeterministic classes are very much more difficult; we shall touch on a dense hierarchy for nondeterministic space in Section 12.5.
A space hierarchy
We now introduce our notion of a "well-behaved" space complexity function. A function S(n) is said to be space constructible if there is some Turing machine M that is S(n) space bounded, and for each n, there is some input of length n on which M actually uses S(n) tape cells.

The set of space-constructible functions includes log n, n^k, 2^n, and n!. If S₁(n) and S₂(n) are space constructible, then so are S₁(n)S₂(n), 2^{S₁(n)}, and S₁(n)^{S₂(n)}. Thus the set of space-constructible functions is very rich.

Note that M above need not use S(n) space on all inputs of length n, just on some one input of that length. If, in fact, M uses exactly S(n) cells on any input of length n, then we say S(n) is fully space constructible. Any space-constructible S(n) ≥ n is fully space constructible (exercise).
In order to simplify the next result, we prove the following lemma.

Lemma 12.1 If L is accepted by an S(n) ≥ log₂ n space-bounded TM, then L is accepted by an S(n) space-bounded TM that halts on all inputs.

Proof Let M be an S(n) space-bounded off-line Turing machine with s states and t tape symbols accepting L. If M accepts, it does so by a sequence of at most (n + 2)sS(n)t^{S(n)} moves, since otherwise some ID repeats; that is, there are n + 2 input head positions, s states, S(n) storage tape head positions, and t^{S(n)} storage tape contents. Thus M, with an additional track added as a move counter, can shut itself off after (4st)^{S(n)} ≥ (n + 2)sS(n)t^{S(n)} moves. Actually, M sets up a counter of length max(i, log₂ n), where i is the number of storage cells used so far, and counts in base 4st. Whenever M scans a new cell beyond the cells containing the counter, it increases the counter length. Thus if M loops having used only i tape cells, the counter will detect this when the count reaches (4st)^{max(i, log₂ n)}, which exceeds the number of ID's with i storage cells in use.

Theorem 12.8 If S₂(n) is a fully space-constructible function,

    inf_{n→∞} S₁(n)/S₂(n) = 0,

and S₁(n) and S₂(n) are each at least log₂ n, then there is a language in DSPACE(S₂(n)) not in DSPACE(S₁(n)).

Proof The theorem is proved by diagonalization. Consider an enumeration of off-line Turing machines with input alphabet {0, 1} and one storage tape, based on the binary encoding of Section 8.3, but with a prefix of 1's permitted, so each TM has arbitrarily long encodings. We construct a TM M that uses S₂(n) space and disagrees on at least one input with any S₁(n) space-bounded TM.

On input w, M begins by marking S₂(n) cells on a tape, where n is the length of w. Since S₂(n) is fully space constructible, this can be done by simulating a TM that uses exactly S₂(n) cells on each input of length n. In what follows, if M attempts to leave the marked cells, M halts and rejects w. This guarantees that M is S₂(n) space bounded.

Next M begins a simulation on input w of TM M_w, the TM encoded by binary string w. If M_w is S₁(n) space bounded and has t tape symbols, then the simulation requires space ⌈log₂ t⌉S₁(n). M accepts w only if M can complete the simulation in S₂(n) space and M_w halts without accepting w.

Since M is S₂(n) space bounded, L(M) is in DSPACE(S₂(n)). We claim L(M) is not in DSPACE(S₁(n)). For suppose there were an S₁(n) space-bounded TM M' with t tape symbols accepting L(M). By Lemma 12.1, we may assume that M' halts on all inputs. Since M' appears infinitely often in the enumeration, and inf_{n→∞} S₁(n)/S₂(n) = 0, there exists a sufficiently long w, |w| = n, such that ⌈log₂ t⌉S₁(n) ≤ S₂(n) and M_w is M'. On input w, M has sufficient space to simulate M_w and accept w if and only if M_w rejects w. Thus L(M_w) ≠ L(M), a contradiction. Therefore L(M) is in DSPACE(S₂(n)) but not in DSPACE(S₁(n)).
common
While most
functions are fully space constructible,
make Theorem
space constructibility to
12.8
We
go through.
we need only
therefore state the
following.
Theorem
Corollary
holds even
12.8
if
S 2 (n)
is
space constructible but not
fully
space constructible.
Proof Let M₁ be a TM with input alphabet Σ₁ that is S₂(n) space bounded and, for each n, actually uses S₂(n) cells on some input of length n. We design M to accept a language over alphabet Σ₁ × {0, 1}. That is, the input to M is treated as if it had two tracks: the first track is a string x in Σ₁*, which is used as input to M₁; the second track is a string in (0 + 1)* that is treated as the code of a TM. The only modification to the design of M is that M lays off the cells it may use by simulating M₁ on the first track of M's input. We may show that M disagrees with any S₁(n) space-bounded TM M' on an input whose length n is sufficiently large, whose first track causes M₁ to use S₂(n) cells, and whose second track is an encoding of M'.

We leave as an exercise a proof that the condition S₂(n) ≥ log₂ n in Theorem 12.8 and its corollary is not really needed. The proof is not a diagonalization, but hinges on showing that

    {wc^i w | w is in (a + b)*, |w| = S₂(n), and i = n − 2S₂(n)}

is accepted in S₂(n) space but not in S₁(n) space if inf_{n→∞} S₁(n)/S₂(n) = 0 and S₂(n) < log₂ n.

Note that if inf_{n→∞} S₁(n)/S₂(n) = 0 and S₁(n) ≤ S₂(n) for all n, then DSPACE(S₁(n)) ⊊ DSPACE(S₂(n)). However, if we do not have S₁(n) ≤ S₂(n), then it is possible that DSPACE(S₁(n)) and DSPACE(S₂(n)) each have languages not in the other.
A time hierarchy

The deterministic time hierarchy is not as tight as the space hierarchy. The reason is that a TM which diagonalizes over all multitape TM's has some fixed number of tapes. To simulate a TM with a larger number of tapes, we make use of the two-tape simulation of a multitape TM, thereby introducing a logarithmic slowdown.

Before giving the construction, we introduce the notion of time constructibility. A function T(n) is said to be time constructible if there exists a T(n) time-bounded multitape Turing machine M such that for each n there exists some input on which M actually makes T(n) moves. Just as for space-constructible functions, there is a rich hierarchy of time-constructible functions. We say that T(n) is fully time constructible if there is a TM that uses T(n) time on all inputs of length n. Again, most common functions are fully time constructible.

Theorem 12.9 If T₂(n) is a fully time-constructible function and

    inf_{n→∞} T₁(n) log₂ T₁(n) / T₂(n) = 0,

then there is a language in DTIME(T₂(n)) but not in DTIME(T₁(n)).

Proof The proof is similar to that of Theorem 12.8, and only a brief sketch of the necessary construction is given. A T₂(n) time-bounded TM M is constructed to operate as follows. M treats the input w as an encoding of a Turing machine M_w and simulates M_w on w. A difficulty arises because M has some fixed number of tapes, while for some w's, M_w will have more tapes than M. Fortunately, by Theorem 12.6, only two tapes are needed to simulate any M_w, although the simulation costs a factor of log T₁(n). Also, since M_w may have many tape symbols, which must be encoded into some fixed number of symbols, the simulation of T₁(n) moves of M_w by M requires time cT₁(n) log T₁(n), where c is a constant depending on M_w.

In order to assure that M is T₂(n) time bounded, M simultaneously executes steps of a TM (using additional tapes) that uses exactly T₂(n) time on all inputs of length n. This is the reason that T₂(n) must be fully time constructible. After T₂(n) steps, M halts. M accepts w only if the simulation of M_w is completed and M_w rejects w. The encoding of M_w is designed as in the previous theorem, so each TM has arbitrarily long encodings. Thus, if M' is a T₁(n) time-bounded Turing machine, there will be a sufficiently large w encoding M' so that

    cT₁(|w|) log₂ T₁(|w|) ≤ T₂(|w|),

and the simulation will carry to completion. In this case, w is in L(M) if and only if w is not in L(M'). Thus L(M') ≠ L(M) for any M' that is T₁(n) time bounded. Therefore L(M) is in DTIME(T₂(n)) − DTIME(T₁(n)).
Example 12.3 Let T₁(n) = 2^n and T₂(n) = n²2^n. Then

    inf_{n→∞} T₁(n) log₂ T₁(n) / T₂(n) = inf_{n→∞} 2^n · n / (n²2^n) = inf_{n→∞} 1/n = 0.

Thus Theorem 12.9 applies, and DTIME(2^n) ≠ DTIME(n²2^n). Since T₁(n) ≤ n²2^n for all n, we may conclude that DTIME(2^n) ⊊ DTIME(n²2^n).
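The hypothesis of Theorem 12.9 in this example can be checked numerically. The sketch below is ours, not the text's; it evaluates the ratio T₁(n) log₂ T₁(n)/T₂(n) for the two choices of T₂ discussed in this chapter.

```python
from math import log2

def ratio(T1, T2, n):
    """The quantity whose inf over n must be 0 for Theorem 12.9 to apply."""
    return T1(n) * log2(T1(n)) / T2(n)

T1 = lambda n: 2**n

# T2(n) = n^2 * 2^n: the ratio is n/n^2 = 1/n, so the inf is 0 and
# Theorem 12.9 separates DTIME(2^n) from DTIME(n^2 * 2^n).
assert abs(ratio(T1, lambda n: n**2 * 2**n, 100) - 0.01) < 1e-12

# T2(n) = n * 2^n: the ratio is identically 1, so Theorem 12.9 is
# silent and a translational argument (see Example 12.5) is needed.
assert abs(ratio(T1, lambda n: n * 2**n, 100) - 1.0) < 1e-12
```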
12.4 RELATIONS AMONG COMPLEXITY MEASURES

There are several straightforward relationships and one not-so-obvious relationship among the complexities of a given language L according to the four complexity measures we have defined. The straightforward relationships are stated in one theorem.
Theorem 12.10

a) If L is in DTIME(f(n)), then L is in DSPACE(f(n)).

b) If L is in DSPACE(f(n)) and f(n) ≥ log₂ n, then there is some constant c, depending on L, such that L is in DTIME(c^{f(n)}).

c) If L is in NTIME(f(n)), then there is some constant c, depending on L, such that L is in DTIME(c^{f(n)}).
Proof

a) If TM M₁ makes no more than f(n) moves, it cannot scan more than f(n) + 1 cells on any tape. By modifying M₁ to hold two symbols per cell, we can lower the storage requirement to ⌈[f(n) + 1]/2⌉, which is at most f(n).

b) Observe that if M₁ has s states, t tape symbols, and uses at most f(n) space, then the number of different ID's of M₁ with input of length n is at most s(n + 2)f(n)t^{f(n)}. Since f(n) ≥ log₂ n, there is some constant c₁ such that for all n, c₁^{f(n)} ≥ s(n + 2)f(n)t^{f(n)}. Construct from M₁ a multitape TM M₂ that uses one tape to count to c₁^{f(n)} and two others to simulate M₁. If M₁ has not accepted when the count reaches c₁^{f(n)}, then M₁ must have repeated an ID, and so M₁ is never going to accept. Clearly M₂ is c^{f(n)} time bounded for some constant c.

c) Let M₁ be an f(n) time-bounded nondeterministic TM with s states, t tape symbols, and k tapes. The number of possible ID's of M₁ given input of length n is at most s(f(n) + 1)^k t^{kf(n)}, the product of the number of states, head positions, and tape contents. Thus d = s(t + 1)^{3k} satisfies d^{f(n)} ≥ s(f(n) + 1)^k t^{kf(n)} for all n ≥ 1. A deterministic multitape TM M₂ can determine if M₁ accepts input w of length n by constructing a list of all the ID's of M₁ that are accessible from the initial ID. This process can be carried out in time bounded by the square of the length of the list. Since the list of accessible ID's has length no greater than d^{f(n)} times the length of an ID, which can be encoded in 1 + k(f(n) + 1) symbols, the time is bounded by c^{f(n)} for some constant c.
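Part (c)'s deterministic simulation is just a closure computation: list every ID accessible from the initial one. A toy version, with a hypothetical successor relation standing in for the moves of M₁:

```python
def accessible(initial, successors):
    """List every ID reachable from `initial`, where successors(I) yields
    the ID's the nondeterministic machine can reach from I in one move.
    The scan is at worst quadratic in the length of the final list, as
    in the proof of Theorem 12.10(c)."""
    found = [initial]
    for current in found:          # `found` grows while being scanned
        for nxt in successors(current):
            if nxt not in found:
                found.append(nxt)
    return found

# Hypothetical configuration graph on ID's 0..5 with branching moves.
graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [], 5: []}
ids = accessible(0, lambda i: graph[i])
print(sorted(ids))
```

The deterministic machine then accepts exactly when some accepting ID appears in the list.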
Theorem 12.11 (Savitch's theorem) If L is in NSPACE(S(n)), then L is in DSPACE(S²(n)), provided S(n) is fully space constructible and S(n) ≥ log₂ n.

Proof Let L = L(M₁), where M₁ is an S(n) space-bounded nondeterministic TM. For some constant c, there are at most c^{S(n)} ID's for an input of length n. Thus, if M₁ accepts its input, it does so by some sequence of at most c^{S(n)} moves, since no ID is repeated in the shortest computation of M₁ leading to acceptance.

Let I₁ ⊢^{2^i} I₂ denote that ID I₂ can be reached from I₁ by a sequence of at most 2^i moves. For i ≥ 1, we can determine if I₁ ⊢^{2^i} I₂ by testing each I' to see if I₁ ⊢^{2^{i−1}} I' and I' ⊢^{2^{i−1}} I₂. Thus the space needed to determine whether we can get from one ID to another in 2^i moves is equal to the space needed to record the ID I' currently being tested, plus the space needed to determine whether we can get from one ID to another in 2^{i−1} moves. Observe that the space used to test whether one ID is reachable from another in 2^{i−1} moves can be reused for each such test.

The details for testing if w is in L(M₁) are given in Fig. 12.5. The algorithm of Fig. 12.5 may be implemented on a Turing machine M₂ that uses a tape as a stack of activation records† for the calls to TEST. Each call has an activation record in which the values of parameters I₁, I₂, and i are placed, as well as the value of the local variable I'. As I₁, I₂, and I' are ID's with no more than S(n) cells, we can represent each of them in S(n) space. The input head position in binary uses log n ≤ S(n) cells. Note that the input tape in all ID's is fixed and is the same as the input to M₂, so we need not copy the input in each ID.
begin
    let n = |w|;
    let m = ⌈log₂ c⌉;
    let I₀ be the initial ID of M₁ with input w;
    for each final ID I_f of length at most S(n) do
        if TEST (I₀, I_f, mS(n)) then accept;
end;

procedure TEST (I₁, I₂, i);
    if i = 0 and (I₁ = I₂ or I₁ ⊢ I₂) then return true;
    if i ≥ 1 then
        for each ID I' of length at most S(n) do
            if TEST (I₁, I', i − 1) and TEST (I', I₂, i − 1) then
                return true;
    return false
end TEST

Fig. 12.5 Algorithm to simulate M₁.

† An "activation record" is the area used for the data belonging to one call of one procedure.
The parameter i can be coded in binary using at most mS(n) cells. Thus each activation record takes space O(S(n)). Since the third parameter decreases by one each time TEST is called, the initial call has i = mS(n), and no call is made when i reaches zero, the maximum number of activation records on the stack is O(S(n)). Thus the total space used is O(S²(n)), and by Theorem 12.1, we may redesign M₂ to make the space be exactly S²(n).
Example 12.4 NSPACE(log² n) ⊆ DSPACE(log⁴ n), NSPACE(n²) ⊆ DSPACE(n⁴), and NSPACE(2^n) ⊆ DSPACE(4^n).

Note that Savitch's theorem holds even if S(n) is space constructible rather than fully space constructible. Observe, however, that then M₂ begins by simulating, on each input of length n, a TM M that constructs S(n), taking the largest amount of space used, in order to lay out the space for the activation records. If we had no way of computing S(n) in even S²(n) space, we could not cycle through all possible values of I_f or I' without getting some that take too much space.
12.5 TRANSLATIONAL LEMMAS AND NONDETERMINISTIC HIERARCHIES

In Theorems 12.8 and 12.9 we saw that the deterministic space and time hierarchies were very dense. It would appear that corresponding hierarchies for nondeterministic machines would require an increase of a square for space and an exponential for time, in order to simulate a nondeterministic machine for diagonalization purposes. However, a translational argument can be used to give a much denser hierarchy for nondeterministic machines. We illustrate the technique for space.

A translation lemma

The first step is to show that containment translates upward. For example, suppose it happened to be true (which it is not) that NSPACE(n³) ⊆ NSPACE(n²). This relation could be translated upward by replacing n by n², yielding NSPACE(n⁶) ⊆ NSPACE(n⁴).

Lemma 12.2 Let S₁(n), S₂(n), and f(n) be fully space constructible, with S₂(n) ≥ n and f(n) ≥ n. Then NSPACE(S₁(n)) ⊆ NSPACE(S₂(n)) implies NSPACE(S₁(f(n))) ⊆ NSPACE(S₂(f(n))).
Proof Let L₁ be accepted by M₁, a nondeterministic S₁(f(n)) space-bounded TM. Let

    L₂ = {x$^i | M₁ accepts x in space S₁(|x| + i)},

where $ is a new symbol not in the alphabet of M₁. Then L₂ is accepted by a TM M₂ as follows. On input x$^i, M₂ marks off S₁(|x| + i) cells, which it may do, since S₁(n) is fully constructible. Then M₂ simulates M₁ on x, accepting if M₁ accepts without using more than S₁(|x| + i) cells. Clearly M₂ is S₁(n) space bounded.

What we have done is to take a set L₁ in NSPACE(S₁(f(n))) and pad the strings with $'s so that the padded version L₂ is in NSPACE(S₁(n)). Now, by the hypothesis that NSPACE(S₁(n)) ⊆ NSPACE(S₂(n)), there is a nondeterministic S₂(n) space-bounded TM M₃ accepting L₂.

Finally, we construct M₄ accepting the original set L₁ within space S₂(f(n)). M₄ marks off f(n) cells and then S₂(f(n)) cells, which it may do since f and S₂ are fully constructible. As S₂(n) ≥ n, f(n) ≤ S₂(f(n)), so M₄ has not used more than S₂(f(n)) cells. Next, M₄ on input x simulates M₃ on x$^i for i = 0, 1, 2, .... To do this, M₄ must keep track of the head location of M₃ on x$^i. If the head of M₃ is within x, M₄'s head is at the corresponding point on its input. Whenever the head of M₃ moves into the $'s, M₄ records the location in a counter. The length of the counter is at most log i. If M₃ accepts during the simulation, then M₄ accepts. If M₃ does not accept, then M₄ increases i until the counter no longer fits on S₂(f(|x|)) tape cells. Then M₄ halts.

Now, if x is in L₁, then x$^i is in L₂ for i satisfying S₁(|x| + i) = S₁(f(|x|)). Since f(n) ≥ n, this equality is satisfied by i = f(|x|) − |x|. Thus the counter requires log(f(|x|) − |x|) space. Since S₂(f(|x|)) ≥ f(|x|), it follows that the counter will fit. Thus x is in L(M₄) if and only if x$^i is in L(M₃) for some i. Therefore L(M₄) = L₁, and L₁ is in NSPACE(S₂(f(n))).
we can relax the condition that S 2 (n) > n, requiring only that provided that S 2 (f(n)) is fully space constructible. Then 4 can lay off S 2 (f(n)) cells without having to lay off f(n) cells. As S 2 (f(n)) > log/(n), Note
that
S 2 (n)
>
there
is still
log 2
M
n,
room
Essentially the for
for
M4
's
counter.
same argument
as in
Lemma
12.2
shows the analogous
results
DSPACE, DTI ME, and NTIME.
Example 12.5 Using the analogous translation result for deterministic time, we can prove that DTIME(2^n) ⊊ DTIME(n2^n). Note that this result does not follow from Theorem 12.9, as

inf_{n→∞} [2^n log 2^n] / [n2^n] = 1.
COMPUTATIONAL COMPLEXITY THEORY
Suppose that DTIME(n2^n) ⊆ DTIME(2^n). Then letting S_1(n) = n2^n, S_2(n) = 2^n, and f(n) = 2^n, we get

DTIME(2^n 2^(2^n)) ⊆ DTIME(2^(2^n)).    (12.6)

Similarly, by letting f(n) = n + 2^n, we obtain

DTIME((n + 2^n)2^n 2^(2^n)) ⊆ DTIME(2^n 2^(2^n)).    (12.7)

Combining (12.6) with (12.7), we get

DTIME((n + 2^n)2^n 2^(2^n)) ⊆ DTIME(2^(2^n)).    (12.8)

However,

inf_{n→∞} [2^(2^n) log 2^(2^n)] / [(n + 2^n)2^n 2^(2^n)] = inf_{n→∞} 1/(n + 2^n) = 0.

Thus Theorem 12.9 implies that (12.8) is false, so our supposition that DTIME(n2^n) ⊆ DTIME(2^n) must be false. Since DTIME(2^n) ⊆ DTIME(n2^n), we conclude that DTIME(2^n) ⊊ DTIME(n2^n).
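The two infima above can be checked with exact rational arithmetic. The following sketch (function names are ours, not the book's) computes the Theorem 12.9 ratio for the pair 2^n, n2^n, which is identically 1, and for the translated pair of (12.8), which is 1/(n + 2^n) and tends to 0:

```python
from fractions import Fraction

# Ratios T1(n) * log2(T1(n)) / T2(n) relevant to Theorem 12.9, computed
# exactly. ratio_direct handles the pair (2^n, n*2^n); ratio_translated
# handles the pair appearing in containment (12.8).

def ratio_direct(n):
    t1 = 2**n                       # log2(t1) = n exactly
    t2 = n * 2**n
    return Fraction(t1 * n, t2)     # equals 1 for every n

def ratio_translated(n):
    t1 = 2**(2**n)                  # log2(t1) = 2^n exactly
    t2 = (n + 2**n) * 2**n * 2**(2**n)
    return Fraction(t1 * 2**n, t2)  # simplifies to 1/(n + 2^n)
```

Since the first ratio never goes below 1, Theorem 12.9 indeed gives no separation for the original pair; only the translated containment produces a ratio with infimum 0.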
Example 12.6 The translation lemma can be used to show that NSPACE(n^3) is properly contained in NSPACE(n^4). Suppose to the contrary that NSPACE(n^4) ⊆ NSPACE(n^3). Then letting f(n) = n^3, we get NSPACE(n^12) ⊆ NSPACE(n^9). Similarly, letting f(n) = n^4, we get NSPACE(n^16) ⊆ NSPACE(n^12), and f(n) = n^5 gives NSPACE(n^20) ⊆ NSPACE(n^15). Putting these together yields NSPACE(n^20) ⊆ NSPACE(n^9). However, we know by Theorem 12.11 that NSPACE(n^9) ⊆ DSPACE(n^18), and by Theorem 12.8, DSPACE(n^18) ⊊ DSPACE(n^20). Thus combining these results, we get

NSPACE(n^20) ⊆ NSPACE(n^9) ⊆ DSPACE(n^18) ⊊ DSPACE(n^20) ⊆ NSPACE(n^20),

a contradiction. Therefore our assumption NSPACE(n^4) ⊆ NSPACE(n^3) was wrong, and we conclude NSPACE(n^3) ⊊ NSPACE(n^4).
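The exponent bookkeeping in this kind of argument can be mechanized. In the sketch below (helper names are ours), a containment NSPACE(n^a) ⊆ NSPACE(n^b) is represented as the pair (a, b); each use of Lemma 12.2 with f(n) = n^k multiplies both exponents by k, and trivial containments NSPACE(n^c) ⊆ NSPACE(n^d) for c ≤ d glue the translated copies into a chain:

```python
# Translation arithmetic for containments between NSPACE classes with
# polynomial bounds, written as exponent pairs (a, b) meaning
# NSPACE(n^a) <= NSPACE(n^b).

def translate(containment, k):
    """Apply Lemma 12.2 with f(n) = n^k."""
    a, b = containment
    return (a * k, b * k)

def chain(assumed, ks):
    """Compose translated copies of `assumed`, gluing adjacent copies with
    the trivial containment NSPACE(n^c) <= NSPACE(n^d) for c <= d."""
    steps = [translate(assumed, k) for k in ks]
    hi, lo = steps[-1]              # e.g. (20, 15) from the largest k
    for a, b in reversed(steps[:-1]):
        assert lo <= a              # trivial containment needed for gluing
        lo = b
    return (hi, lo)
```

Running `chain((4, 3), [3, 4, 5])` reproduces the derivation of Example 12.6, ending at NSPACE(n^20) ⊆ NSPACE(n^9).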
A nondeterministic space hierarchy

Example 12.6 can be generalized to show a dense hierarchy for nondeterministic space in the polynomial range.

Theorem 12.12 If ε > 0 and r ≥ 0, then NSPACE(n^r) ⊊ NSPACE(n^(r+ε)).
Proof If r is any nonnegative real number, we can find positive integers s and t such that r ≤ s/t and r + ε > (s + 1)/t. Therefore it suffices to prove for all positive integers s and t that

NSPACE(n^(s/t)) ⊊ NSPACE(n^((s+1)/t)).

Suppose to the contrary that NSPACE(n^((s+1)/t)) ⊆ NSPACE(n^(s/t)). Then by Lemma 12.2 with f(n) = n^((s+i)t), we have

NSPACE(n^((s+1)(s+i))) ⊆ NSPACE(n^(s(s+i)))    (12.9)

for i = 0, 1, ..., s. As s(s + i) ≤ (s + 1)(s + i - 1) for i ≥ 1, we know that

NSPACE(n^(s(s+i))) ⊆ NSPACE(n^((s+1)(s+i-1))).    (12.10)

Using (12.9) and (12.10) alternately, we have

NSPACE(n^((s+1)(2s))) ⊆ NSPACE(n^(s(2s))) ⊆ NSPACE(n^((s+1)(2s-1))) ⊆ NSPACE(n^(s(2s-1))) ⊆ ··· ⊆ NSPACE(n^((s+1)s)) ⊆ NSPACE(n^(s^2)).

That is,

NSPACE(n^(2s^2+2s)) ⊆ NSPACE(n^(s^2)).

However, by Savitch's theorem,

NSPACE(n^(s^2)) ⊆ DSPACE(n^(2s^2)),

and by Theorem 12.8,

DSPACE(n^(2s^2)) ⊊ DSPACE(n^(2s^2+2s)).

Clearly,

DSPACE(n^(2s^2+2s)) ⊆ NSPACE(n^(2s^2+2s)).

Combining these results, we get NSPACE(n^(2s^2+2s)) ⊊ NSPACE(n^(2s^2+2s)), a contradiction. We conclude that our assumption NSPACE(n^((s+1)/t)) ⊆ NSPACE(n^(s/t)) was wrong. Since containment in the opposite direction is obvious, we conclude

NSPACE(n^(s/t)) ⊊ NSPACE(n^((s+1)/t))

for any positive integers s and t.
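The alternating use of (12.9) and (12.10) is pure exponent arithmetic and can be verified mechanically. A sketch (function name ours) that walks the chain for a given s, checking the inequality behind (12.10) at each step:

```python
# Walk the chain of containments in the proof of Theorem 12.12 for a given s,
# starting at exponent (s+1)(2s) = 2s^2 + 2s and ending at s^2.

def alternating_chain(s):
    start = (s + 1) * (2 * s)
    exp = start
    for i in range(s, -1, -1):
        assert exp == (s + 1) * (s + i)          # current left-hand exponent
        exp = s * (s + i)                        # apply (12.9) with this i
        if i >= 1:
            assert exp <= (s + 1) * (s + i - 1)  # (12.10): trivial containment
            exp = (s + 1) * (s + i - 1)
    return start, exp                            # (2s^2 + 2s, s^2)
```

For every s the chain starts at 2s^2 + 2s and terminates at s^2, exactly the pair that contradicts Savitch's theorem plus the deterministic hierarchy.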
Similar dense hierarchies for nondeterministic space can be proved for ranges higher than the polynomials, and we leave some of these results as exercises. Theorem 12.12 does not immediately generalize to nondeterministic time, because of the key role of Savitch's theorem, for which no time analog is known. However, a time analog of Theorem 12.12 has been established by Cook [1973a].
12.6 PROPERTIES OF GENERAL COMPLEXITY MEASURES: THE GAP, SPEEDUP, AND UNION THEOREMS
In this section we discuss some unintuitive properties of complexity measures. While we prove them only for deterministic space complexity, they will be seen in the next section to apply to all measures of complexity.
Theorems 12.8 and 12.9 indicate that the space and time hierarchies are very dense. However, in both theorems the functions are required to be constructible. Can this condition be discarded? The answer is no: the deterministic space and time hierarchies have arbitrarily large gaps in them.
We say that a statement with parameter n is true almost everywhere (a.e.) if it is true for all but a finite number of values of n. We say a statement is true infinitely often (i.o.) if it is true for an infinite number of n's. Note that both a statement and its negation may be true i.o.

Lemma 12.3 If L is accepted by a TM M that is S(n) space bounded a.e., then L is accepted by an S(n) space-bounded TM.

Proof Use the finite control to accept or reject strings of length n for the finite number of n where M is not S(n) bounded. Note that the construction is not effective, since in the absence of a time bound we cannot tell which of these words M accepts.
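The construction in Lemma 12.3 amounts to hard-wiring a finite exception table in front of the original recognizer. A minimal sketch, where the recognizer and the table are illustrative Python stand-ins rather than TM's:

```python
# Lemma 12.3's patching idea: answer the finitely many exceptional strings
# from a built-in table (the "finite control") and defer to the original
# recognizer everywhere else.

def patch(accept, exceptions):
    """`exceptions` maps each problematic string to its correct True/False
    answer; `accept` is consulted for all other strings."""
    def patched(w):
        if w in exceptions:
            return exceptions[w]
        return accept(w)
    return patched
```

The patched recognizer uses no more space than the original outside the finitely many exceptions, which is the content of the lemma.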
Lemma 12.4 There is an algorithm to determine, given TM M, input length n, and integer m, whether m is the maximum number of tape cells used by M on some input of length n.

Proof For each m and n there is a limit t on the number of moves M may make on input of length n without using more than m cells of any storage tape or repeating an ID. Simulate all sequences of up to t moves, beginning with each input of length n.
Theorem 12.13 (Borodin's Gap Theorem) Given any total recursive function g(n) ≥ n, there exists a total recursive function S(n) such that

DSPACE(S(n)) = DSPACE(g(S(n))).

In other words, there is a "gap" between space bounds S(n) and g(S(n)) within which the minimal space complexity of no language lies.
Proof Let M_1, M_2, ... be an enumeration of TM's. Let S_i(n) be the maximum number of tape cells used by M_i on any input of length n. If M_i always halts, then S_i(n) is a total function and is the space complexity of M_i, but if M_i does not halt on some input of length n, then S_i(n) is undefined.† We construct S(n) so that for each k either

1) S_k(n) < S(n) a.e., or

2) S_k(n) > g(S(n)) i.o.

That is, no S_k(n) lies almost everywhere between S(n) and g(S(n)).

In constructing S(n) for a given value of n, we restrict our attention to the finite set of TM's M_1, M_2, ..., M_n. The value for S(n) is selected so that for no i, 1 ≤ i ≤ n, does S_i(n) lie between S(n) and g(S(n)). If we could compute the largest finite value of S_i(n) for 1 ≤ i ≤ n, we could set S(n) equal to that value. However, since some S_i(n) are undefined, we cannot compute the largest value. Instead, we initially set j to 1 and see if there is some S_i(n) in our finite set for which S_i(n) lies between j and g(j). If there is some such S_i(n), then set j to S_i(n) + 1 and repeat the process. If not, set S(n) to j and we are done. As there is but a finite number of TM's under consideration, and by Lemma 12.4 we can tell whether S_i(n) = m for any fixed m, the process will eventually compute a value for j such that for 1 ≤ i ≤ n, either S_i(n) < j or S_i(n) > g(j). Assign S(n) this value of j.

Suppose there were some language L in DSPACE(g(S(n))) but not in DSPACE(S(n)). Then L = L(M_k) for some k, where S_k(n) ≤ g(S(n)) for all n. By the construction of S(n), for all n ≥ k, S_k(n) < S(n). That is, S_k(n) < S(n) a.e., and hence by Lemma 12.3, L is in DSPACE(S(n)), a contradiction. We conclude that DSPACE(S(n)) = DSPACE(g(S(n))).
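The inner j-search of the proof can be run on a finite table of space usages. The sketch below is a slight variant of the text (it jumps past the largest usage caught in the interval rather than an arbitrary one), with None playing the role of an undefined, i.e. infinite, S_i(n):

```python
def gap_value(usages, g):
    """Smallest j >= 1 such that no defined usage lies in [j, g(j)].
    `usages` lists S_1(n), ..., S_n(n); None (undefined) acts as infinity
    and can never fall inside a finite interval."""
    j = 1
    while True:
        hit = [s for s in usages if s is not None and j <= s <= g(j)]
        if not hit:
            return j
        j = max(hit) + 1   # jump past every usage caught in the interval
```

Since the table is finite and j increases past a caught value each round, the loop terminates, and on exit every defined usage is either below j or above g(j), exactly the dichotomy the proof needs.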
Theorem 12.13 and its analogs for the other three complexity measures have a number of highly unintuitive consequences, such as the following.

Example 12.7 There is a total recursive function f(n) such that

DTIME(f(n)) = NTIME(f(n)) = DSPACE(f(n)) = NSPACE(f(n)).

Clearly DTIME(f(n)) is contained within NTIME(f(n)) and DSPACE(f(n)). Similarly, both NTIME(f(n)) and DSPACE(f(n)) are contained within NSPACE(f(n)). By Theorem 12.10, for all f(n) ≥ log_2 n, if L is in NSPACE(f(n)), then there is a constant c, depending only on L, such that L is in DTIME(c^f(n)). Therefore, L = L(M) for some TM M whose time complexity is bounded above by f(n)^f(n) a.e. By the DTIME analog of Lemma 12.3, L is in DTIME(f(n)^f(n)). Finally, the DTIME analog of Theorem 12.13 with g(x) = x^x establishes the existence of f(n) for which DTIME(f(n)) = DTIME(f(n)^f(n)), proving the result.

Similarly, if one has two universal models of computation, but one is very simple and slow, say a Turing machine that makes one move per century, and the other is very fast, say a random-access machine with powerful built-in instructions for multiplication, exponentiation, and so on, that performs a million operations per second, it is easily shown that there exists a total recursive T(n) such that any function computable in time T(n) on one model is computable in time T(n) on the other.

† We identify an undefined value with infinity, so an undefined value is larger than any defined value.
The speed-up theorem

Another curious phenomenon regarding complexity measures is that there are functions with no best programs (Turing machines). We have already seen that every TM allows a linear speed-up in time and compression in space. We now show that there are languages with no "best" program; that is, recognizers for these languages can be sped up indefinitely. We shall work only with space and show that there is a language L such that for any Turing machine accepting L, there always exists another Turing machine that accepts L and uses, for example, only the square root of the space used by the former. This new recognizer can of course be replaced by an even faster recognizer, and so on, ad infinitum.

The basic idea of the proof is quite simple. By diagonalization we construct L so that L cannot be recognized quickly by any "small" machine, that is, a machine with a small integer index encoding it. As machine indices increase, the diagonalization process allows faster and faster machines recognizing L. Given any machine recognizing L, it has some fixed index and thus can recognize L only so fast. However, machines with larger indices can recognize L arbitrarily more quickly.
Theorem 12.14 (Blum's Speed-up Theorem) Let r(n) be any total recursive function. There exists a recursive language L such that for any Turing machine M_i accepting L, there exists a Turing machine M_j accepting L such that r(S_j(n)) ≤ S_i(n) for almost all n.

Proof Without loss of generality assume that r(n) is a monotonically nondecreasing fully space-constructible function with r(n) ≥ n^2 (see Exercise 12.9). Define h(n) by

h(1) = 2,  h(n) = r(h(n - 1)).

Then h(n) is a fully space-constructible function, as the reader may easily show. Let M_1, M_2, ... be an enumeration of all off-line TM's analogous to that of Section 8.3 for single-tape TM's. In particular, we assume that the code for M_i has length log_2 i. We construct L so that

1) if L(M_i) = L, then S_i(n) > h(n - i) a.e.;

2) for each k, there exists a Turing machine M_j such that L(M_j) = L and S_j(n) ≤ h(n - k).

The above conditions on L assure that for each M_i accepting L there exists an M_j with S_i(n) ≥ r(S_j(n)) a.e.
To see this, select j so that S_j(n) ≤ h(n - i - 1) and L(M_j) = L; such an M_j exists by (2). Then by (1),

S_i(n) > h(n - i) = r(h(n - i - 1)) ≥ r(S_j(n)) a.e.
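The scale function h grows as a tower of exponentials. A sketch for the representative choice r(x) = x^2, which satisfies the assumption r(x) ≥ x^2; the closed form h(n) = 2^(2^(n-1)) noted in the comment is specific to this particular r:

```python
# h(1) = 2, h(n) = r(h(n-1)) from the proof of Theorem 12.14.
# With the default r(x) = x^2, h(n) = 2^(2^(n-1)).

def h(n, r=lambda x: x * x):
    v = 2
    for _ in range(n - 1):
        v = r(v)
    return v
```

The rapid growth is the point: shifting the argument by one, as in condition (1) versus condition (2), changes the bound by a full application of r.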
Now we construct L. For n = 0, 1, 2, ... in turn, we specify whether 0^n is in L. In the process, certain TM's are designated as "canceled." A canceled TM surely does not accept L. Let σ(n) be the least integer i ≤ n such that S_i(n) < h(n - i) and M_i is not canceled by 0^n' for n' = 0, 1, ..., n - 1. When we consider n, if σ(n) exists, M_σ(n) becomes canceled, and we put 0^n in L if and only if M_σ(n) does not accept 0^n.

To prove condition (1), let L(M_i) = L. In constructing L, all TM's M_j for j < i that are ever canceled are canceled after considering some finite number of n's, say up to n_0. Note that n_0 cannot be effectively computed, but nevertheless it exists. Suppose S_i(n) < h(n - i) for some n > max(n_0, i). When we consider that n, M_i would be canceled had it not been previously canceled. But a canceled TM surely does not accept L. Thus S_i(n) ≥ h(n - i) for n > max(n_0, i), that is, S_i(n) ≥ h(n - i) a.e.
To prove condition (2), we show that there exists, for given k, a TM M = M_j such that L(M) = L and S_j(n) ≤ h(n - k) for all n. To determine whether 0^n is in L, M must simulate M_σ(n) on 0^n. To know what σ(n) is, M must determine which M_i's have already been canceled by 0^n' for n' < n. However, constructing the list of canceled TM's directly requires seeing whether M_i uses more than h(n' - i) space for 0 ≤ n' < n and 1 ≤ i ≤ n'. For i ≤ k + n' - n this requires more than h(n - k) space.

The solution is to observe that any TM M_i with i ≤ k that is ever canceled is canceled when we consider some n' less than a particular n_1. Thus we incorporate into the finite control of M the list of all TM's M_i, i ≤ k, canceled by any 0^n' for n' < n_1, and also incorporate n_1. If n ≥ n_1, then to compute σ(n) and simulate M_σ(n) on 0^n, it will only be necessary to simulate TM's M_i on inputs 0^n', where n_1 ≤ n' ≤ n and k < i ≤ n', to see whether M_i is canceled by 0^n'. To test whether M_i is canceled by 0^n', we need only simulate M_i using h(n' - i) of M_i's cells, which is less than h(n - k), as n' ≤ n and i > k. Note that σ(n), if it exists, is greater than k for n ≥ n_1. Thus simulating M_σ(n) on input 0^n takes h(n - σ(n)) of M_σ(n)'s cells, which is less than h(n - k) cells.

Lastly, we must show that M can be made to operate within space h(n - k). We need only simulate TM's M_i for k < i ≤ n on inputs 0^n', n_1 ≤ n' ≤ n, to see whether they get canceled, so we need represent no more than h(n - k - 1) cells of M_i's tape for any simulation. Since i ≤ n, the integer code for M_i has length no more than log_2 n. Thus any tape symbol of M_i can be coded using log_2 n of M's cells. As r(x) ≥ x^2, we know h(x) ≥ 2^(2^(x-1)), and by the definition of h, [h(n - k - 1)]^2 ≤ h(n - k). As h(n - k - 1) ≥ 2^(2^(n-k-2)) ≥ log_2 n a.e., we have h(n - k - 1) log_2 n ≤ [h(n - k - 1)]^2 ≤ h(n - k), so h(n - k) space is sufficient for the simulation for almost all n.
In addition to the space required for simulating the TM's, space is needed to maintain the list of canceled TM's. This list consists of at most n TM's, each with a code of length at most log_2 n. The n log_2 n space needed to maintain the list of canceled TM's is also less than h(n - k) a.e. By Lemma 12.3, M can be modified to recognize in its finite control the words 0^n for the finitely many n where n log_2 n > h(n - k) or 2^(2^(n-k-2)) < log_2 n. The desired TM M is of space complexity h(n - k) for all n.
The union theorem

The last theorem in this section, called the union theorem, has to do with the naming of complexity classes. By way of introduction, we know that each polynomial such as n^2 or n^3 defines a space complexity class (as well as complexity classes of the other three types). However, does polynomial space form a complexity class? That is, does there exist an S(n) such that DSPACE(S(n)) contains all sets recognizable in a polynomial space bound and no other sets? Clearly, S(n) must be almost everywhere greater than any polynomial, but it must also be small enough so that one cannot fit another function that is the space used by some TM between it and the polynomials, where "fit" must be taken as a technical term whose meaning is defined precisely in the next theorem.
Theorem 12.15 Let {f_i(n) | i = 1, 2, ...} be a recursively enumerable collection of recursive functions. That is, there is a TM that enumerates a list of TM's, the first computing f_1, the second computing f_2, and so on. Also assume that f_i(n) ≤ f_{i+1}(n) for each i and n. Then there exists a recursive S(n) such that

DSPACE(S(n)) = ∪_{i≥1} DSPACE(f_i(n)).

Proof We construct a function S(n) satisfying the following two conditions:

1) For each i, S(n) ≥ f_i(n) a.e.

2) If S_j(n) is the exact space complexity of some TM M_j, and for each i, S_j(n) > f_i(n) i.o., then S_j(n) > S(n) for some n (and in fact, for infinitely many n's).

The first condition assures that

∪_i DSPACE(f_i(n)) ⊆ DSPACE(S(n)).

The second condition assures that DSPACE(S(n)) contains only those sets that are in DSPACE(f_i(n)) for some i. Together the conditions imply that

DSPACE(S(n)) = ∪_i DSPACE(f_i(n)).

Setting S(n) = f_n(n) would assure condition (1). However, it may not satisfy condition (2). There may be a TM M_j whose space complexity S_j(n) is greater than each f_i(n) i.o., but less than f_n(n) for all n. Thus there may be sets in DSPACE(f_n(n)) not in ∪_i DSPACE(f_i(n)). To overcome this problem we construct S(n) so that it is greater than each f_i(n) a.e., and in fact, S(n) will dip below, for an infinity of n's, each S_j(n) that is i.o. greater than every f_i(n). This is done by guessing for each TM M_j an integer i_j such that f_{i_j}(n) ≥ S_j(n) a.e. The "guess" is not nondeterministic; rather it is subject to deterministic revision as follows. If at some point we discover that the guess is not correct, we guess a larger value for i_j, and for some particular n we define S(n) to be less than S_j(n). If it happens that S_j grows faster than any f_i, S will infinitely often be less than S_j. On the other hand, if some f_i is almost everywhere greater than S_j, eventually we shall guess one such f_i and stop assigning values of S less than S_j.

In Fig. 12.6 we give an algorithm that generates S(n). A list called LIST of "guesses" of the form "i_j = k" for various integers j and k is maintained. For each j, there will be at most one guess on LIST at any time. As in the previous theorem, M_1, M_2, ... is an enumeration of all off-line TM's, and S_j(n) is the maximum amount of space used by M_j on any input of length n. Recall that S_j(n) may be undefined (infinite) for some values of n.
begin
1)    LIST := empty list;
2)    for n = 1, 2, 3, ... do
3)        if for all "i_j = k" on LIST, f_k(n) ≥ S_j(n) then
4)            add "i_n = n" to LIST and define S(n) = f_n(n)
          else
          begin
5)            among all guesses "i_j = k" on LIST such that f_k(n) < S_j(n), let "i_j = k" be the guess with the smallest k and, given that k, the smallest j;
6)            define S(n) = f_k(n);
7)            replace "i_j = k" by "i_j = n" on LIST;
8)            add "i_n = n" to LIST
          end
end

Fig. 12.6 Definition of S(n).
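The algorithm of Fig. 12.6 can be transcribed directly into executable form. In this sketch, f and S are ordinary Python functions standing in for the f_k's and S_j's (our toy choices, not the book's), and LIST holds guesses as (j, k) pairs:

```python
import math  # math.inf can model an undefined (infinite) S_j(n)

def define_S(f, S, N):
    """Compute S(1), ..., S(N) per Fig. 12.6; f(k, n) is f_k(n), S(j, n) is S_j(n)."""
    LIST, out = [], {}
    for n in range(1, N + 1):
        bad = [(j, k) for (j, k) in LIST if f(k, n) < S(j, n)]
        if not bad:                                     # lines (3)-(4)
            LIST.append((n, n))
            out[n] = f(n, n)
        else:                                           # lines (5)-(8)
            j, k = min(bad, key=lambda jk: (jk[1], jk[0]))
            out[n] = f(k, n)
            LIST = [(j, n) if (jj, kk) == (j, k) else (jj, kk)
                    for (jj, kk) in LIST]               # line (7): revise the guess
            LIST.append((n, n))
    return out
```

For instance, with f_k(n) = nk and every machine using constant space S_j(n) = j, no guess is ever wrong and the output is simply f_n(n) = n^2; a faster-growing S_j triggers the else-branch and the occasional dip.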
To prove that DSPACE(S(n)) = ∪_i DSPACE(f_i(n)), we first show that S(n) satisfies conditions (1) and (2). Consider condition (1). Observe that S(n) is assigned a value only at lines (4) and (6) of Fig. 12.6. To see that for each m, S(n) ≥ f_m(n) a.e., first consider the values of S(n) defined at line (4). When n reaches m, each value of S(n) defined at line (4) is at least f_m(n). Thus S(n) ≥ f_m(n) except possibly for the finite set of n less than m and for the values of S(n) defined at line (6).

Now consider the values of S(n) defined at line (6). When n reaches m, LIST will have some finite number of guesses. Each of these guesses may subsequently cause one value of S(n), for some n > m, to be less than f_m(n). However, when that happens, line (7) causes that guess to be replaced by a guess "i_j = p" for some p ≥ m, and this guess, if selected at line (5), does not cause S(n) to be made less than f_m(n), since f_p(n) ≥ f_m(n) whenever p ≥ m. Thus from line (6) there are only finitely many n greater than m (at most the length of LIST when n = m) for which S(n) < f_m(n). Since there are only a finite number of n's less than m, S(n) ≥ f_m(n) a.e.

Next we must show condition (2), that if there exists a TM M_j such that for each i, S_j(n) > f_i(n) i.o., then S_j(n) > S(n) for infinitely many n. At all times after n = j, LIST will have a guess for i_j, and LIST is always finite. For n = j we place "i_j = j" on LIST. As S_j(n) > f_j(n) i.o., there will be arbitrarily many subsequent values of n for which the condition of step (3) does not hold. At each of these times, either our "i_j = j" is selected at line (5), or some other one of the finite number of guesses on LIST when n = j is selected. In the latter case, that guess is replaced by a guess "i_p = q" with q > j. All guesses added to LIST are also of the form "i_p = q" for q > j, so eventually our "i_j = j" is selected at step (5), and for this value of n, we have S_j(n) > f_j(n) = S(n). Thus condition (2) is true.

Lastly we must show that conditions (1) and (2) imply

DSPACE(S(n)) = ∪_i DSPACE(f_i(n)).

Suppose L is in ∪_i DSPACE(f_i(n)). Then L is in DSPACE(f_m(n)) for some particular m. By condition (1), S(n) ≥ f_m(n) a.e. Thus by Lemma 12.3, L is in DSPACE(S(n)). Now suppose that L is in DSPACE(S(n)). Let L = L(M_j), where S_j(n) ≤ S(n) for all n. If for no i, L is in DSPACE(f_i(n)), then by Lemma 12.3, every TM M_k accepting L has S_k(n) > f_i(n) i.o. for each i. Thus by condition (2) there is some n for which S_k(n) > S(n). Letting k = j produces a contradiction.
Example 12.8 Let f_i(n) = n^i. Then we may surely enumerate a sequence of TM's M_1, M_2, ... such that M_i, presented with input 0^n, writes 0^(n^i) on its tape and halts. Thus Theorem 12.15 says that there is some S(n) such that

DSPACE(S(n)) = ∪_i DSPACE(n^i).

As any polynomial p(n) is equal to or less than some n^i a.e., DSPACE(S(n)) is the union over all polynomials p(n) of DSPACE(p(n)). This union, which in the next chapter we shall call PSPACE, and which plays a key role in the theory of intractable problems, is thus seen to be a deterministic space complexity class.
12.7 AXIOMATIC COMPLEXITY THEORY

The reader may have observed that many theorems in this chapter are not dependent on the fact that we are measuring the amount of time or space used, but only that we are measuring some resource that is being consumed as the computation proceeds. In fact one could postulate axioms governing resources and give a completely axiomatic development of complexity theory. In this section we briefly sketch this approach.
The Blum axioms

Let M_1, M_2, ... be an enumeration of Turing machines defining among them every partial recursive function. For technical reasons we consider the M_i's as computing partial recursive functions φ_i rather than as recognizing sets. The reason is that it is notationally simpler to measure complexity as a function of the input rather than of the length of the input. Let φ_i(n) be the function of one variable computed by M_i, and let Φ_1(n), Φ_2(n), ... be a set of partial recursive functions satisfying the following two axioms (Blum's axioms).

Axiom 1 Φ_i(n) is defined if and only if φ_i(n) is defined.

Axiom 2 The function R(i, n, m), defined to be 1 if Φ_i(n) = m and 0 otherwise, is a total recursive function.

The function Φ_i(n) gives the complexity of the computation of the ith Turing machine on input n. Axiom 1 requires that Φ_i(n) is defined if and only if the ith Turing machine halts on input n. Thus one possible Φ_i would be the number of steps of the ith Turing machine. The amount of space used is another alternative, provided we define the space used to be infinite if the TM enters a loop. Axiom 2 requires that we can determine whether the complexity of the ith Turing machine on input n is m. For example, if our complexity measure is the number of steps in the computation, then given i, n, and m, we can simulate M_i on 0^n for m steps and see if it halts. Lemma 12.4 and its analogs are claims that Axiom 2 holds for the four measures with which we have been concerned.
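For the step-count measure, Axiom 2's function R can be made concrete in a toy setting where "machines" are Python generators and one yield is one step; this modeling is entirely our assumption for illustration:

```python
# Axiom 2 for the step-count measure, in miniature: R(i, n, m) simulates
# machine i on "input" n for at most m steps, so it is total even when
# the machine never halts.

def machine_double(n):
    # takes exactly n steps, then halts
    for _ in range(n):
        yield

def machine_loop(n):
    # never halts
    while True:
        yield

MACHINES = [machine_double, machine_loop]   # a two-machine "enumeration"

def R(i, n, m):
    """1 if the step-count complexity Phi_i(n) equals m, else 0."""
    g = MACHINES[i](n)
    steps = 0
    while steps <= m:
        try:
            next(g)
        except StopIteration:
            return 1 if steps == m else 0
        steps += 1
    return 0   # ran more than m steps, so Phi_i(n) != m
```

Note that R never needs to decide halting: the budget of m steps bounds the simulation, which is exactly why the step count satisfies Axiom 2.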
Example 12.9 Deterministic space complexity satisfies Blum's axioms, provided we say Φ_i(n) is undefined if M_i does not halt on input 0^n, even though the amount of space used by M_i on 0^n may be limited. Deterministic time complexity likewise satisfies the axioms if we say Φ_i(n) is undefined whenever M_i runs forever or halts without any 0^j on its tape. To compute R(i, n, m), simply simulate M_i for m steps on input 0^n.

We may establish that nondeterministic time and space satisfy the axioms if we make an intelligent definition of what it means for an NTM to compute a function. For example, we might say that φ_i(n) = j if and only if there is some sequence of choices by M_i with input 0^n that halts with 0^j on the tape, and no sequence of choices that leads to halting with some 0^k, k ≠ j, on the tape. If we define Φ_i(n) = φ_i(n), we do not satisfy Axiom 2. Suppose R(i, n, m) were recursive. Then there is an algorithm to tell if M_i with input 0^n halts with 0^m on its tape. Given any TM M, we may construct M_i to simulate M. If M halts with any tape, M_i erases its own tape. If i is an index for M_i, then R(i, n, 0) is true if and only if M halts on input 0^n. Thus if R(i, n, m) were recursive, we could tell if a given TM M halts on a given input, which is undecidable (see Exercise 8.3).

Recursive relationships among complexity measures
Many of the theorems on complexity can be proved solely from the two axioms. In particular, the fact that there are arbitrarily complex functions, the speed-up theorem, the gap theorem, and the union theorem can be so proved. We prove only one theorem here to illustrate the techniques. The theorem we select is that all measures are recursively related. That is, given any two complexity measures Φ and Φ', there is a total recursive function r such that the complexity of the TM M_i in one measure, Φ'_i(n), is at most r(n, Φ_i(n)). For example, Theorems 12.10 and 12.11 showed that for the four measures of complexity with which we have been dealing, at most an exponential function related any pair of these complexity measures. In a sense, functions that are "easy" in one measure are "easy" in any other measure, although the term "easy" must be taken lightly, as r could be a very rapidly growing function, such as Ackermann's function.
Theorem 12.16 Let Φ and Φ' be two complexity measures. Then there exists a recursive function r such that for all i,

r(n, Φ_i(n)) ≥ Φ'_i(n) a.e.

Proof Let

r(n, m) = max_{i ≤ n} {Φ'_i(n) | Φ_i(n) = m}.

The function r is recursive, since Φ_i(n) = m may be tested by Axiom 2. Should Φ_i(n) be equal to m, then φ_i(n) is defined, so by Axiom 1, Φ'_i(n) is defined, and hence the maximum can be computed. Clearly r(n, Φ_i(n)) ≥ Φ'_i(n) for n ≥ i, since for n ≥ i, Φ'_i(n) is among the values over which the maximum is taken.
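The definition of r in the proof can be evaluated over finite tables of measure values. The tables below are toy data of our own invention (None pads the unused index 0; machine and input indices are 1-based):

```python
# PHI[i][n] is Phi_i(n) and PHIHAT[i][n] is Phi'_i(n) for two toy machines
# on two inputs.
PHI    = [None, [None, 3, 3], [None, 3, 7]]
PHIHAT = [None, [None, 5, 6], [None, 9, 2]]

def r(n, m, Phi=PHI, PhiHat=PHIHAT):
    """max over i <= n of PhiHat[i][n] subject to Phi[i][n] == m (0 if none)."""
    vals = [PhiHat[i][n] for i in range(1, min(n, len(Phi) - 1) + 1)
            if Phi[i][n] == m]
    return max(vals, default=0)
```

For machine i and input n ≥ i, the value Φ'_i(n) is always among the candidates maximized over, so r(n, Φ_i(n)) ≥ Φ'_i(n), which is the inequality claimed by the theorem.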
Although the axiomatic approach is elegant and allows us to prove results in a more general framework, it fails to capture at least one important aspect of our intuitive notion of complexity. If we construct a Turing machine M_k that first executes M_i on n and then executes M_j on the result, we would expect the complexity of M_k on n to be at least as great as that of M_i on n. However, there are complexity measures for which this is not the case. In other words, by doing additional computation we can reduce the complexity of what we have already done. We leave the construction of such a complexity measure as an exercise.
EXERCISES
12.1 The notion of a crossing sequence, the sequence of states in which the boundary between two cells is crossed, was defined in Section 2.6 in connection with two-way finite automata. However, the notion applies equally well to single-tape TM's. Prove the following basic properties of crossing sequences.

a) The time taken by a single-tape TM M on input w is the sum of the lengths of the crossing sequences between each two cells of M's tape.
b) Suppose M is a single-tape TM that, if it accepts its input, does so to the right of the cells on which its input was originally written. Show that if M accepts input w_1 w_2, and the crossing sequence between w_1 and w_2 is the same as that between x_1 and x_2 when M is given input x_1 x_2, then M accepts x_1 w_2.

*12.2 Use Exercise 12.1 to show that the languages {wcw^R | w is in (a + b)*} and {wcw | w is in (a + b)*} each require kn^2 steps on some input of sufficiently large odd length n, for some constant k > 0. Thus the bound of Theorem 12.5 is in a sense the best possible.

*12.3 The notion of crossing sequences can be adapted to off-line TM's if we replace the notion of "state" by the state, contents of storage tapes, and positions of the storage tape heads. Theorem 12.8, the space hierarchy, applied only to space complexities of log_2 n or above. Prove that the same holds for fully space-constructible S_2(n) below log n. [Hint: Using a generalized crossing sequence argument, show that {wc^i w | w is in (a + b)* and |w| = 2^(S_2(n)), where n = |wc^i w|} is in DSPACE(S_2(n)) but not in DSPACE(S_1(n)).]
*12.4 Show, using generalized crossing sequence arguments, that if L is not a regular set and L is in DSPACE(S(n)), then S(n) > log log n i.o. Show the same result for nondeterministic space. Thus for deterministic and nondeterministic space there is a "gap" between 1 and log log n.

12.5 Show that Lemma 12.2, the "translation lemma," applies to
a) deterministic space,
b) deterministic time, and
c) nondeterministic time.
12.6 Show that DTIME(2^(2^n + n)) properly includes DTIME(2^(2^n)).

12.7 Show that NSPACE((c + ε)^n) properly includes NSPACE(c^n) for any c > 1 and ε > 0.

12.8 What, if any, is the relationship between each of the following pairs of complexity classes?
a) DSPACE(n^2) and DSPACE(f(n)), where f(n) = n for odd n and n^3 for even n
b) DTIME(2^n) and DTIME(3^n)
c) NSPACE(2^n) and DSPACE(5^n)
d) DSPACE(n) and DTIME([log_2 n]^n)
12.9 Show that if r is any total recursive function, then there is a fully space-constructible, monotonically nondecreasing r' such that r'(n) ≥ r(n) and r'(x) ≥ x^2 for all integers x. [Hint: Consider the space complexity of any TM computing r.]

12.10 Show that there is a total recursive function S(n) such that L is in DSPACE(S(n)) if and only if L is accepted by some c^n space-bounded TM, for some c > 1.
12.11 Suppose we used axioms for computational complexity theory as it pertains to languages rather than functions. That is, let M_1, M_2, ... be an enumeration of Turing machines and L_i the language accepted by M_i. Replace Axiom 1 by:

Axiom 1': Φ_i(n) is defined if and only if M_i halts on all inputs of length n.

Reprove Theorem 12.16 for Axioms 1' and 2.
12.12 Show that the speed-up and gap theorems hold for NSPACE, DTIME, and NTIME. [Hint: Use Theorem 12.16 and the speed-up and gap theorems for DSPACE.]

12.13 Show that the following are fully time and space constructible:
a) n²
b) 2^n
c) n!

12.14 Show that the following are fully space constructible:
a) √n
b) log₂ n
c) Some function that is bounded above by log₂ log₂ n and is at least c log₂ log₂ n, for some c > 0, infinitely often.

*12.15 Show that if T₂(n) is time constructible, and

  inf_{n→∞} T₁(n) log T₁(n) / T₂(n) = 0,

then there is a language accepted by a T₂(n) time-bounded one-tape TM but by no T₁(n) time-bounded one-tape TM. [Hint: To simulate a T₂(n) time-bounded one-tape machine Mᵢ by a one-tape machine, move the description of Mᵢ so that it is always near the tape head. Similarly, carry along a "counter" to tell when Mᵢ has exceeded its time limit.]

*12.16 Show that if T₂(n) is time constructible and

  inf_{n→∞} T₁(n) log*(T₁(n)) / T₂(n) = 0,

then for all k, there is a language accepted by a T₂(n) time-bounded k-tape TM but by no T₁(n) time-bounded k-tape TM. Here log*(m) is the number of times we must take logarithms base 2 of m to get to 1 or below. For example, log*(3) = 2 and log*(2^65536) = 5. Note that this exercise implies Exercise 12.15.
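The log* function is easy to tabulate; the following minimal sketch (ours, not the text's) checks the two values worked out above.

```python
import math

def log_star(m):
    """Number of times log2 must be applied to m to bring it to 1 or below."""
    count = 0
    while m > 1:
        m = math.log2(m)
        count += 1
    return count

print(log_star(3))           # 2, as in the text
print(log_star(2 ** 65536))  # 5, as in the text
```

Note that math.log2 evaluates exact powers of two exactly, so the second call terminates after precisely five applications.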
*12.17 Show that for any complexity measure, one cannot bound the complexity of a function in terms of its value. That is, there is no total recursive function f such that Φᵢ(n) ≤ f(n, φᵢ(n)) for all i and n for which φᵢ(n) is defined.

*12.18 The speed-up theorem implies that for arbitrarily large recursive functions r we can find a language L for which there is a sequence of TM's M₁, M₂, ..., each accepting L, such that for every i, the space used by Mᵢ is at least r applied to the space used by Mᵢ₊₁. However, we merely proved that Mᵢ₊₁ must exist; we did not give an algorithm for finding such a sequence. Prove that speed up is not effective, in that given a TM accepting L, there can be no algorithm to find an equivalent TM using less space; that is, the list of TM's M₁, M₂, ... satisfying the theorem is not recursively enumerable.

*12.19 Which of the following are complexity measures?
a) Φᵢ(n) = the number of state changes made by Mᵢ on input n.
b) Φᵢ(n) = the maximum number of moves made by Mᵢ without a state change on input n.
c) Φᵢ(n) = 0 for all i and n.
d) Φᵢ(n) = 10 if φᵢ(n) is defined, undefined otherwise.

*12.20 (Honesty theorem for space.) Show that there is a total recursive function r such that for every space complexity class 𝒞 there is a function S(n) such that DSPACE(S(n)) = 𝒞 and S(n) is computable in r(S(n)) space.

*12.21 Theorem 12.7 shows that given S(n), there is a set L such that any TM recognizing L uses more than S(n) space i.o. Strengthen this result to show there is a set E such that any TM recognizing E uses more than S(n) space a.e.

**12.22 Let Φ be a complexity measure and let c(i, j) be any recursive function such that when φᵢ(n) and φⱼ(n) are defined then so is φ_{c(i,j)}(n). Prove there exists a recursive function h such that

  Φ_{c(i,j)}(n) ≤ h(n, Φᵢ(n), Φⱼ(n)) a.e.

**12.23 Show that if f(n) is fully space constructible then DTIME(f(n) log f(n)) ⊆ DSPACE(f(n)).

12.24 Exhibit a TM that accepts an infinite set containing no infinite regular subset.

12.25 Consider one-tape TM's that use one unit of ink each time they change a symbol on the tape.
a) Prove a linear "speed-up" theorem for ink.
b) Give an appropriate definition of a "fully ink-constructible" function.
c) How much of an increase in the amount of ink is necessary to obtain a new complexity class?
12.26 A Turing machine is said to be oblivious if the head position at each time unit depends only on the length of the input and not on the actual input. Prove that if L is accepted by a k-tape T(n) time-bounded TM, then L is accepted by a 2-tape T(n) log T(n) time-bounded oblivious TM.

*12.27 Let L ⊆ (0 + 1)* be the set accepted by some T(n) time-bounded TM. Prove that for each n there exists a Boolean circuit, with inputs x₁, ..., xₙ, having at most T(n) log T(n) two-input gates and producing output 1 if and only if the values of x₁, ..., xₙ correspond to a string in L. The values of x₁, ..., xₙ correspond to the string x if xᵢ has value true whenever the ith symbol of x is 1 and xᵢ has value false whenever the ith symbol of x is 0. [Hint: Simulate an oblivious TM.]

**12.28 Loop programs consist of variables that take on integer values, and statements. A statement is of one of the forms below.
1) (variable) := (variable)
2) (variable) := (variable) + 1
3) for i := 1 to (variable) do (statement);
4) begin (statement); (statement); ... (statement) end;
In (3) the value of the variable is bound before the loop, as in PL/I.
a) Prove that loop programs always terminate.
b) Prove that every loop program computes a primitive recursive function.
c) Prove that every primitive recursive function is computed by some loop program.
d) Prove that a TM with a primitive recursive running time can compute only a primitive recursive function.
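The four statement forms can be made concrete with a small interpreter; the tuple encoding and the sample addition program below are our own illustration of the definition, not part of the exercise.

```python
def run(stmt, env):
    """Execute one loop-program statement; env maps variable names to ints."""
    op = stmt[0]
    if op == 'assign':            # form 1: x := y
        _, x, y = stmt
        env[x] = env.get(y, 0)
    elif op == 'incr':            # form 2: x := y + 1
        _, x, y = stmt
        env[x] = env.get(y, 0) + 1
    elif op == 'for':             # form 3: for i := 1 to x do S
        _, x, body = stmt
        for _ in range(env.get(x, 0)):  # bound evaluated once, before the loop
            run(body, env)
    elif op == 'begin':           # form 4: begin S1; S2; ... end
        for s in stmt[1:]:
            run(s, env)
    return env

# addition z := x + y as a loop program: z := x; for i := 1 to y do z := z + 1
add_prog = ('begin',
            ('assign', 'z', 'x'),
            ('for', 'y', ('incr', 'z', 'z')))
```

Since the only way to repeat work is a `for` loop whose bound is fixed in advance, every program written this way terminates, which is the content of part (a).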
12.29 Let F be a formal proof system in which we can prove theorems about one-tape TM's. Define a complexity class

  C_{T(n)} = {L(Mᵢ) | there exists a proof in F that Tᵢ(n) ≤ T(n) for all n}.

Can the time hierarchy of Exercise 12.16 be strengthened for provable complexity? [Hint: Replace the clock by a proof that Tᵢ(n) ≤ T(n).]
Solutions to Selected Exercises

12.2(a) Consider any string wcw^R of length n, and for 1 ≤ i ≤ n/2 let ℓ_{w,i} be the length of the crossing sequence between positions i and i + 1 made by some one-tape TM M with s states. Suppose the average of ℓ_{w,i} over all words w of length (n - 1)/2 is p(i). Then ℓ_{w,i} ≤ 2p(i) for at least half of all w's. The number of w's is 2^{(n-1)/2}, so there are at least 2^{(n-3)/2} w's with ℓ_{w,i} ≤ 2p(i). As the number of crossing sequences of length 2p(i) or less is at most s^{2p(i)+1}, there must be at least 2^{(n-3)/2}/s^{2p(i)+1} w's with the same crossing sequence between positions i and i + 1. There are 2^{(n-1)/2 - i} sequences of a's and b's that may appear in positions i + 1 through (n - 1)/2 in these words, so if

  2^{(n-3)/2}/s^{2p(i)+1} > 2^{(n-1)/2 - i},      (12.11)

then two words with the same crossing sequence differ somewhere among the first i positions. Then by Exercise 12.1(b), M accepts a word it should not accept. Thus (12.11) is false, and s^{2p(i)+1} ≥ 2^{i-1}. Therefore,
  p(i) ≥ (i - 1)/(2 log₂ s) - 1/2.

Surely there is some word w such that M, presented with wcw^R, takes at least the average time. By Exercise 12.1(a), this time is at least Σ_{i=1}^{(n-1)/2} p(i), which grows as n²/log₂ s.

12.14(c) We may design an off-line TM M of space complexity S(n) equal to the logarithm of the largest i such that 2, 3, ..., i all divide n. When presented with an input of length n, M tests for i = 2, 3, ... whether i divides n, stopping as soon as we encounter a value of i that does not divide n. The test whether i divides n needs only log₂ i storage cells. If we let n = k!, we know that 2, 3, ..., k all divide n, so S(n) ≥ log₂ k. As k! ≤ k^k, we know that log₂ n ≤ k log₂ k, and hence log₂ log₂ n ≤ log₂ k + log₂ log₂ k ≤ 2 log₂ k. Thus S(n) ≥ ½ log₂ log₂ n for those values of n that are k! for some k.

We must show that S(n) is bounded above by log₂ log₂ n; that is, for all n for which S(n) ≥ k, n is at least 2^{2^k - 1}. It suffices to show that the smallest n divisible by each of 2, 3, ..., i, which is the least common multiple (LCM) of 2, 3, ..., i, is at least 2^{i-1}; that is, we need the fact that LCM(2, 3, ..., i) ≥ 2^{i-1}. A proof requires results in the theory of numbers that we are not prepared to derive, in particular that the probability that an integer i is a prime is asymptotically 1/ln i, where ln is the natural logarithm (see Hardy and Wright [1938]). Since LCM(2, 3, ..., i) is at least the product of the primes between 2 and i, a lower bound on the order of e^i for LCM(2, 3, ..., i) for large i is easy to derive.
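The space function in this solution is easy to compute directly; the helper below (our illustration, with function names of our own choosing) returns the largest i such that 2, 3, ..., i all divide n, and S(n) as its base-2 logarithm.

```python
import math

def largest_prefix_divisor(n):
    """Largest i such that every one of 2, 3, ..., i divides n (1 if 2 does not)."""
    i = 1
    while n % (i + 1) == 0:
        i += 1
    return i

def S(n):
    return math.log2(largest_prefix_divisor(n))

# For n = 5! = 120, each of 2..6 divides n but 7 does not,
# so S(120) = log2(6) >= log2(5), in line with the k! argument above.
```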
BIBLIOGRAPHIC NOTES

The study of time complexity can be said to begin with Hartmanis and Stearns [1965], where Theorems 12.3, 12.4, 12.5, and 12.9 are found. The serious study of space complexity begins with Hartmanis, Lewis, and Stearns [1965], and Lewis, Stearns, and Hartmanis [1965]; Theorems 12.1, 12.2, and 12.8 are from the former. Seiferas [1977a,b] presents some of the most recent results on complexity hierarchies. A number of earlier papers studied similar aspects of computation. In Grzegorczyk [1953], Axt [1959], and Ritchie [1963] we find hierarchies of recursive functions. Yamada [1962] studies the class of real-time computable functions [T(n) = n]. Rabin [1963] showed that two tapes can do more than one in real time, a result that has since been generalized by Aanderaa [1974] to k versus k - 1 tapes. Theorem 12.6, showing that logarithmic slowdown suffices when one goes from many tapes to two, is from Hennie and Stearns [1966]. Theorem 12.11, the quadratic relationship between nondeterministic and deterministic space, appears in Savitch [1970]. Translational lemmas were pioneered by Ruby and Fischer [1965], while Theorem 12.12, the nondeterministic space hierarchy, is by Ibarra [1972]. The nondeterministic time hierarchy alluded to in the text is from Cook [1973a]. The best nondeterministic hierarchies known are found in Seiferas, Fischer, and Meyer [1973]. Book and Greibach [1970] characterize the languages in ⋃_{c>0} NTIME(cn).

The study of abstract complexity measures originates with Blum [1967]. Theorem 12.13, the gap theorem, is from Borodin [1972] and (in essence) Trakhtenbrot [1964]; a stronger version is due to Constable [1972]. (Note that these and all the papers mentioned in this paragraph deal with Blum complexity measures, not solely with space, as we have done.) Theorem 12.14, the speed-up theorem, is from Blum [1967], and the union theorem is from McCreight and Meyer [1969]. Theorem 12.16, on recursive relationships among complexity measures, is from Blum [1967]. The honesty theorem mentioned in Exercise 12.20 is from McCreight and Meyer [1969]. The simplified approach to abstract complexity used in this book is based on the ideas of Hartmanis and Hopcroft [1971]. Crossing sequences, discussed in Exercises 12.1 and 12.2, are from Hennie [1965]. The generalization of crossing sequences used in Exercises 12.3 and 12.4 is developed in Hopcroft and Ullman [1969a], although Exercise 12.4 in the deterministic case is from Hartmanis, Lewis, and Stearns [1965]. Exercise 12.14(c) is from Freedman and Ladner [1975]. Exercise 12.16, a denser time hierarchy when TM's are restricted to have exactly k tapes, is from Paul [1977]. Exercise 12.18, showing that speed up cannot be made effective, is from Blum [1971]. Exercise 12.22 is from Hartmanis and Hopcroft [1971]. Exercise 12.23 is from Hopcroft, Paul, and Valiant [1975]. See also Paul, Tarjan, and Celoni [1976] for a proof that the method of Hopcroft et al. cannot be extended. Oblivious Turing machines and Exercises 12.26 and 12.27 are due to M. Fischer and N. Pippenger. Loop programs and Exercise 12.28 are from Ritchie [1963] and Meyer and Ritchie [1967].
CHAPTER 13

INTRACTABLE PROBLEMS
In Chapter 8 we discovered that one can pose problems that are not solvable on a computer. In this chapter we see that among the decidable problems, there are some so difficult that for all practical purposes they cannot be solved in their full generality on a computer. Some of these problems, although decidable, have been proved to require exponential time for their solution. For others the implication is very strong that exponential time is required to solve them; if there were a faster way of solving them than the exponential one, then a great number of important problems in mathematics, computer science, and other fields, problems for which good solutions have been sought in vain over a period of many years, could be solved by substantially better means than are now known.
13.1 POLYNOMIAL TIME AND SPACE
The languages recognizable in deterministic polynomial time form a natural and important class, the class ⋃_{i≥1} DTIME(n^i), which we denote by 𝒫. It is an intuitively appealing notion that 𝒫 is the class of problems that can be solved efficiently. Although one might quibble that an n^51-step algorithm is not very efficient, in practice we find that problems in 𝒫 usually have low-degree polynomial time solutions.

There are a number of important problems that do not appear to be in 𝒫 but have efficient nondeterministic algorithms. These problems fall into the class ⋃_{i≥1} NTIME(n^i), which we denote by 𝒩𝒫. An example is the Hamilton circuit problem: Does a graph have a cycle in which each vertex of the graph appears exactly once? There does not appear to be a deterministic polynomial time algorithm to recognize those graphs with Hamilton circuits. However, there is a simple nondeterministic algorithm: guess the edges in the cycle and verify that they do indeed form a Hamilton circuit.
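The guess-and-verify split can be made concrete: checking a proposed cycle is easy, even though finding one appears hard. A sketch, with graphs encoded as edge lists of our own choosing:

```python
def is_hamilton_circuit(n, edges, cycle):
    """Check in polynomial time that `cycle` (a list of vertices) is a
    Hamilton circuit of the graph on vertices 0..n-1 with edge set `edges`."""
    if sorted(cycle) != list(range(n)):          # each vertex exactly once
        return False
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset((cycle[i], cycle[(i + 1) % n])) in edge_set
               for i in range(n))
```

A nondeterministic machine guesses `cycle` and runs only this check; a deterministic machine apparently has no better recourse than searching through the exponentially many candidate cycles.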
The difference between 𝒫 and 𝒩𝒫 is analogous to the difference between efficiently finding a proof of a statement (such as "this graph has a Hamilton circuit") and efficiently verifying a proof (i.e., checking that a particular circuit is Hamilton). We intuitively feel that checking a given proof is easier than finding one, but we don't know this for a fact.

Two other natural classes are

  PSPACE = ⋃_{i≥1} DSPACE(n^i)  and  NSPACE = ⋃_{i≥1} NSPACE(n^i).

Note that by Savitch's theorem (Theorem 12.11), PSPACE = NSPACE, since NSPACE(n^i) ⊆ DSPACE(n^{2i}). Obviously 𝒫 ⊆ 𝒩𝒫 ⊆ PSPACE, yet it is not known whether any of these containments is proper. Moreover, as we shall see, it is unlikely that the mathematical tools needed to resolve the questions one way or the other have been developed.
Within PSPACE we have two hierarchies of complexity classes:

  DSPACE(log n) ⊆ DSPACE(log² n) ⊆ DSPACE(log³ n) ⊆ ···

and

  NSPACE(log n) ⊆ NSPACE(log² n) ⊆ NSPACE(log³ n) ⊆ ···

Clearly DSPACE(log^k n) ⊆ NSPACE(log^k n), and thus by Savitch's theorem

  ⋃_{k≥1} NSPACE(log^k n) = ⋃_{k≥1} DSPACE(log^k n).

Although one can show that 𝒫 ≠ ⋃_{k≥1} DSPACE(log^k n), containment of either class in the other is unknown. Nevertheless DSPACE(log n) ⊆ 𝒫 ⊆ 𝒩𝒫 ⊆ PSPACE, and at least one of the containments is proper, since DSPACE(log n) ⊊ PSPACE by the space hierarchy theorem.

Bounded reducibilities
Recall that in Chapter 8 we showed a language L̄ to be undecidable by taking a known undecidable language L and reducing it to L̄. That is, we exhibited a mapping g, computed by a TM that always halts, such that for all strings x, x is in L if and only if g(x) is in L̄. Then if L̄ were recursive, L could be recognized by computing g(x) and deciding whether g(x) is in L̄. By restricting g to be an easily computable function, we can establish that L is or is not in some class such as 𝒫, 𝒩𝒫, or PSPACE. We shall be interested particularly in two types of reducibility: polynomial-time reducibility and log-space reducibility. We say that L is polynomial-time reducible to L̄ if there is a polynomial-time bounded TM that for each input x produces an output y that is in L̄ if and only if x is in L.

Lemma 13.1 Let L be polynomial-time reducible to L̄. Then
a) L is in 𝒩𝒫 if L̄ is in 𝒩𝒫;
b) L is in 𝒫 if L̄ is in 𝒫.
Proof The proofs of (a) and (b) are similar. We prove only (b). Assume that the reduction is p₁(n) time bounded and that L̄ is recognizable in time p₂(n), where p₁ and p₂ are polynomials. Then L can be recognized in polynomial time as follows. Given input x of length n, produce y using the polynomial-time reduction. As the reduction is p₁(n) time bounded, and at most one symbol can be printed per move, it follows that |y| ≤ p₁(n). Then we can test whether y is in L̄ in time p₂(p₁(n)). Thus the total time to tell whether x is in L is p₁(n) + p₂(p₁(n)), which is polynomial in n. Therefore L is in 𝒫.
A log-space transducer is an off-line TM that always halts, having log n scratch storage and a write-only output tape on which the head never moves left. We say that L is log-space reducible to L̄ if there is a log-space transducer that, given input x, produces an output string y that is in L̄ if and only if x is in L.

Lemma 13.2 If L is log-space reducible to L̄, then
a) L is in 𝒫 if L̄ is in 𝒫;
b) L is in NSPACE(log^k n) if L̄ is in NSPACE(log^k n);
c) L is in DSPACE(log^k n) if L̄ is in DSPACE(log^k n).
Proof
a) It suffices to show that a log-space reduction cannot take more than polynomial time, so the result follows from Lemma 13.1(b). In proof, note that the output tape contents cannot influence the computation, so the product of the number of states, storage tape contents, and positions of the input and storage tape heads is an upper bound on the number of moves that can be made before the log-space transducer must enter a loop, which would contradict the assumption that it always halts. If the storage tape has length log n, this bound is easily seen to be polynomial in n.

There is a subtlety involved in the proofs of (b) and (c). We prove only (c), the proof of (b) being essentially the same as for (c).
c) Let M₁ be the log-space transducer that reduces L to L̄, and let M₂ be a log^k n space bounded TM accepting L̄. On input x of length n, M₁ produces an output of length bounded by n^c for some constant c. Since the output cannot be written in log^k n space, M₂ cannot be simulated by storing the output of M₁ on a tape. Instead, the output of M₁ is fed directly to M₂, a symbol at a time. This works as long as M₂ moves right on its input. Should M₂ move left, M₁ must be restarted to determine the input symbol for M₂, since the output of M₁ is not saved.

We construct M₃ to accept L as follows. One storage tape of M₃ holds the input position of M₂, in base 2. Since the input position cannot exceed n^c, this number can be stored in log n space. The other storage tapes of M₃ simulate the storage tapes of M₁ and the state and storage tapes of M₂. Suppose at some time M₂'s input head is at position i, and M₂ makes a move left or right. M₃ adjusts the stored input position accordingly. Then M₃ restarts the simulation of M₁ from the beginning, and waits until M₁ has produced i - 1 or i + 1 output symbols, if M₂'s input head moved left or right, respectively. The last output symbol produced is the new symbol scanned by M₂'s head, so M₃ is ready to simulate the next move of M₂. As special cases, if M₁ halts before producing i + 1 output symbols (when M₂ moves right), we assume that M₂ next scans the right endmarker, and if M₂ moves left from position 1, we assume that M₂ next scans the left endmarker. M₃ accepts its own input whenever M₂ accepts its simulated input. Thus M₃ is a log^k n space bounded TM accepting L.
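The restarting technique in this proof can be sketched directly, with the transducer modeled as a Python generator (the modeling is ours); only the position counter is retained, never the output string.

```python
def nth_output(transducer, x, i):
    """Return the i-th (1-indexed) output symbol of the transducer on x,
    recomputing from the beginning each time, as M3 does with M1.
    None plays the role of the endmarker when the output is too short."""
    for count, symbol in enumerate(transducer(x), start=1):
        if count == i:
            return symbol
    return None

def double(x):
    """A toy transducer: write every input symbol twice."""
    for c in x:
        yield c
        yield c
```

Each call to `nth_output` reruns the transducer from scratch, trading time for space exactly as M₃ does.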
Lemma 13.3 The composition of two log-space (resp. polynomial-time) reductions is a log-space (resp. polynomial-time) reduction.

Proof An easy generalization of the constructions in Lemmas 13.1 and 13.2.
Complete problems

As we have mentioned, no one knows whether 𝒩𝒫 includes languages not in 𝒫, so the issue of proper containment is open. One way to find a language in 𝒩𝒫 - 𝒫 is to look for a "hardest" problem in 𝒩𝒫. Intuitively, a language L₀ is a hardest problem if every language in 𝒩𝒫 is reducible to L₀ by an easily computable reduction. Depending on the exact kind of reducibility, we can conclude certain things about L₀. For example, if all of 𝒩𝒫 is log-space reducible to L₀, we can conclude that if L₀ were in 𝒫, then 𝒫 would equal 𝒩𝒫. Similarly, if L₀ were in DSPACE(log n), then 𝒩𝒫 = DSPACE(log n). If all of 𝒩𝒫 were polynomial-time reducible to L₀, then we could still conclude that if L₀ were in 𝒫, then 𝒫 would equal 𝒩𝒫, but we could not conclude from the statement "L₀ is in DSPACE(log n)" that 𝒩𝒫 = DSPACE(log n).

We see from the above examples that the notion of "hardest" may depend on the kind of reducibility involved. That is, there may be languages L₀ such that all languages in 𝒩𝒫 have polynomial-time reductions to L₀, but not all have log-space reductions to L₀. Moreover, log-space and polynomial-time reductions do not exhaust the kinds of reductions we might consider. With this in mind, we define the notion of hardest (complete) problems for a general class of languages with respect to a particular kind of reduction. Clearly the following generalizes to
an arbitrary type of reduction.

Let 𝒞 be a class of languages. We say language L is complete for 𝒞 with respect to polynomial-time (resp. log-space) reductions if L is in 𝒞, and every language in 𝒞 is polynomial-time (resp. log-space) reducible to L. We say L is hard for 𝒞 with respect to polynomial-time (resp. log-space) reductions if every language in 𝒞 is polynomial-time (resp. log-space) reducible to L, but L is not necessarily in 𝒞. Two special cases are of primary importance, and we introduce shorthands for them. L is NP-complete (NP-hard) if L is complete (hard) for 𝒩𝒫 with respect to log-space reductions.† L is PSPACE-complete (PSPACE-hard) if L is complete (hard) for PSPACE with respect to polynomial-time reductions.

In order to show a first language L₀ to be NP-complete, we must give a log-space reduction of each language in 𝒩𝒫 to L₀. Once we have an NP-complete problem L₀, we may prove another language L₁ to be NP-complete by exhibiting a log-space reduction of L₀ to L₁, since the composition of two log-space reductions is a log-space reduction by Lemma 13.3. This same technique will be used for establishing complete problems for other classes as well.
13.2 SOME NP-COMPLETE PROBLEMS
The significance of the class of NP-complete problems is that it includes many problems that are natural and have been examined seriously for efficient solutions. None of these problems is known to have a polynomial-time solution. The fact that if any one of these problems were in 𝒫, all would be, reinforces the notion that they are unlikely to have polynomial-time solutions. Moreover, if a new problem is proved NP-complete, then we have the same degree of confidence that the new problem is hard that we have for the classical problems.

The first problem we show to be NP-complete, which happens to be historically the first such problem, is satisfiability for Boolean expressions. We begin by defining the problem precisely.
† Many authors use the term "NP-complete" to mean "complete for 𝒩𝒫 with respect to polynomial time reductions," or in some cases, "with respect to polynomial time Turing reductions."

The satisfiability problem

A Boolean expression is an expression composed of variables, parentheses, and the operators ∧ (logical AND), ∨ (logical OR), and ¬ (negation). The precedence of these operators is ¬ highest, then ∧, then ∨. Variables take on values 0 (false) and 1 (true); so do expressions. If E₁ and E₂ are Boolean expressions, then the
value of E₁ ∧ E₂ is 1 if both E₁ and E₂ have value 1, and 0 otherwise. The value of E₁ ∨ E₂ is 1 if either E₁ or E₂ has value 1, and 0 otherwise. The value of ¬E₁ is 1 if E₁ is 0, and 0 if E₁ is 1. An expression is satisfiable if there is some assignment of 0's and 1's to the variables that gives the expression the value 1. The satisfiability problem is to determine, given a Boolean expression, whether it is satisfiable.
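The obvious deterministic procedure tries all 2^m assignments, exponential in the number of variables, while checking any single guessed assignment is fast. A sketch, using Python's own boolean syntax in place of the book's coding:

```python
from itertools import product

def satisfiable(expr, variables):
    """Try every 0/1 assignment to the variables; `expr` is a Python
    boolean expression over them, such as "(x or y) and not x"."""
    for values in product([False, True], repeat=len(variables)):
        if eval(expr, dict(zip(variables, values))):  # check one assignment
            return True
    return False
```

A nondeterministic machine replaces the outer loop by a single guess, which is exactly the 𝒩𝒫 membership argument used in Theorem 13.1 below is-style proofs.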
We may represent the satisfiability problem as a language L_sat as follows. Let the variables of some expression be x₁, x₂, ..., x_m for some m. Code xᵢ as the symbol x followed by i written in binary. The alphabet of L_sat is thus

  {∧, ∨, ¬, (, ), x, 0, 1}.

The length of the coded version of an expression of n symbols is easily seen to be no more than ⌈n log₂ n⌉, since each symbol other than a variable is coded by one symbol, there are no more than ⌈n/2⌉ different variables in an expression of length n, and the code for a variable requires no more than 1 + ⌈log₂ n⌉ symbols. We shall henceforth treat the word in L_sat representing an expression of length n as if the word itself were of length n. Our results will not depend on whether we use n or n log n for the length of the word, since log(n log n) ≤ 2 log n, and we shall deal with log-space reductions.
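The variable coding can be applied token by token; in this sketch (the tokenized input format is our own) x3 becomes x followed by 11, the binary form of 3.

```python
def code_expression(tokens):
    """Code variables x1, x2, ... over the alphabet {and, or, not, (, ), x, 0, 1}:
    variable x_i becomes 'x' followed by i written in binary."""
    coded = []
    for t in tokens:
        if t[0] == 'x' and t[1:].isdigit():
            coded.append('x' + bin(int(t[1:]))[2:])  # strip Python's '0b' prefix
        else:
            coded.append(t)
    return coded
```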
A Boolean expression is said to be in conjunctive† normal form (CNF) if it is of the form E₁ ∧ E₂ ∧ ⋯ ∧ E_k, where each Eᵢ, called a clause (or conjunct), is of the form αᵢ₁ ∨ αᵢ₂ ∨ ⋯ ∨ αᵢᵣ, where each αᵢⱼ is a literal, that is, either x or ¬x for some variable x. We usually write x̄ instead of ¬x. For example, (x₁ ∨ x̄₂) ∧ (x̄₁ ∨ x₃ ∨ x₄) ∧ x̄₃ is in CNF. The expression is said to be in 3-CNF if each clause has exactly three distinct literals. The above example is not in 3-CNF because the first and third clauses have fewer than three literals.

Satisfiability is NP-complete

We begin by giving a log-space reduction of each language in 𝒩𝒫 to L_sat.
Theorem 13.1 The satisfiability problem is NP-complete.

Proof The easy part of the proof is to show that L_sat is in 𝒩𝒫. To determine if an expression of length n is satisfiable, nondeterministically guess values for all the variables and then evaluate the expression. Thus L_sat is in 𝒩𝒫.

To show that every language in 𝒩𝒫 is reducible to L_sat, for each NTM M that is time bounded by a polynomial p(n) we give a log-space algorithm that takes as input a string x and produces a Boolean formula E_x that is satisfiable if and only if M accepts x. We now describe E_x. Let #β₀#β₁#⋯#β_{p(n)} be a computation of M, where each βᵢ is an ID consisting of exactly p(n) symbols. If acceptance occurs before the p(n)th move, we allow the accepting ID to repeat, so each computation has exactly p(n) + 1 ID's.
† "Conjunctive" is an adjective referring to the logical AND operator (conjunction). The term "disjunctive" is similarly applied to logical OR.
In each ID we group the state with the symbol scanned to form a single composite symbol. In addition, the composite symbol contains an integer m indicating the move by which the (i + 1)st ID follows from the ith. Numbers are assigned to moves by arbitrarily ordering the finite set of choices that M may make given a state and tape symbol.
For each symbol X that can appear in a computation, and for each i, 0 ≤ i < (p(n) + 1)², we create a Boolean variable c_{iX} to indicate whether the ith symbol in the computation is X. (The 0th symbol in the computation is the initial #.) The expression E_x we shall construct will be true for a given assignment to the c_{iX}'s if and only if the c_{iX}'s that are true correspond to a valid computation. The expression E_x states the following:

1) The c_{iX}'s that are true correspond to a string of symbols, in that exactly one c_{iX} is true for each i.
2) The first ID β₀ is an initial ID of M with input x.
3) The last ID contains a final state.
4) Each ID follows from the previous one by the move of M indicated.

The formula E_x is the logical AND of four formulas, each enforcing one of the above conditions.

The first formula states that for each i between 0 and (p(n) + 1)² - 1, exactly one c_{iX} is true. For a given value of i, the term ⋁_X c_{iX} forces at least one c_{iX} to be true, and ⋀_{X≠Y} (c̄_{iX} ∨ c̄_{iY}) forces at most one to be true.

Let x = a₁a₂⋯aₙ. The second formula, expressing the fact that β₀ is an initial ID, is in turn the AND of:

i) c_{0,#} ∧ c_{p(n)+1,#}: the symbols in positions 0 and p(n) + 1 are #.
ii) c_{1,Y₁} ∨ c_{1,Y₂} ∨ ⋯ ∨ c_{1,Y_k}, where Y₁, Y₂, ..., Y_k are all the composite symbols that represent tape symbol a₁, the start state q₀, and the number of a legal move of q₀ reading symbol a₁. This clause states that the first symbol of β₀ is correct.
iii) ⋀_{2≤i≤n} c_{i,aᵢ}: the 2nd through nth symbols of β₀ are correct.
iv) ⋀_{n+1≤i≤p(n)} c_{i,B}: the remaining symbols of β₀ are blank.

The third formula says that the last ID has an accepting state. It is

  ⋁_{1≤i≤p(n)} ⋁_Y c_{p(n)(p(n)+1)+i, Y},

where Y ranges over the composite symbols that include a final state.
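The first formula has a standard clause shape, one long clause for "at least one" plus a pairwise clause for "at most one"; the sketch below uses a list-of-literals encoding of our own, with a leading minus sign marking negation.

```python
def exactly_one(vars_):
    """CNF clauses forcing exactly one of vars_ to be true: one clause
    containing all of them, plus ('not a' or 'not b') for each pair."""
    clauses = [list(vars_)]                      # at least one is true
    clauses += [['-' + a, '-' + b]
                for idx, a in enumerate(vars_)
                for b in vars_[idx + 1:]]        # no two are both true
    return clauses
```

For m variables this produces 1 + m(m-1)/2 clauses, consistent with the O(p²(n)) size bound claimed for E_x.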
To see how to write the fourth formula, stating that each ID βᵢ, i ≥ 1, follows from βᵢ₋₁ by the move appearing in the composite symbol of βᵢ₋₁, observe that we can essentially deduce each symbol of βᵢ from the corresponding symbol of βᵢ₋₁ and the symbols on either side (one of which may be #). That is, the symbol in βᵢ is the same as the corresponding symbol in βᵢ₋₁ unless that symbol had the state and move, or one of the adjacent symbols had the state and move, and the move caused the head position to shift to where the symbol of βᵢ in question was. Note that should this symbol of βᵢ be the one representing the state, it also represents an arbitrary legal move of M, so there may be more than one legal symbol. Also note that if the previous ID has an accepting state, the current and previous ID's are equal.

We can therefore easily specify a predicate f(W, X, Y, Z) that is true if and only if symbol Z could appear in position j of some ID, given that W, X, and Y are the symbols in positions j - 1, j, and j + 1 of the previous ID [W is # if j = 1 and Y is # if j = p(n)]. It is convenient also to declare f(W, #, X, #) to be true, so we can treat the markers between ID's as we treat the symbols within ID's. We can now express the fourth formula as

  ⋀_{p(n)+2 ≤ i ≤ (p(n)+1)²-1}  ⋁_{W,X,Y,Z such that f(W,X,Y,Z)} (c_{i-p(n)-2,W} ∧ c_{i-p(n)-1,X} ∧ c_{i-p(n),Y} ∧ c_{i,Z}).
easy, given an accepting
the c lV 's that
make E x
true. Just
make
M
M
.
We
have just shown that
satisfiability
for
Boolean expressions
is
NP-
complete. This means that a polynomial-time algorithm for accepting L, M could in „V'&. Let L be the language accepted by some time-bounded nondeterministic Turing machine M, and let A be the logspace (hence polynomial-time) transducer that converts x to £ v where E x is
be used to accept any language
p(n)
,
satisfiable
if
and only
if
M accepts
x.
Then A combined with
the algorithm for L^ a
,
.
as shown in Fig. 13.1, is a deterministic polynomial-time algorithm accepting L. Thus the existence of a polynomial-time algorithm for just this one problem, the satisfiability of Boolean expressions, would imply 𝒫 = 𝒩𝒫.

[Fig. 13.1 Deterministic polynomial-time algorithm for an arbitrary language L in 𝒩𝒫, given an algorithm for L_sat: the input x is fed to an algorithm constructing E_x from x, and E_x is fed to the algorithm for L_sat.]
Restricted satisfiability problems that are NP-complete

Recall that a Boolean formula is in conjunctive normal form (CNF) if it is the logical AND of clauses, which are the logical OR of literals. We say the formula is in k-CNF if each clause has exactly k literals. For example, (x ∨ ȳ) ∧ (x̄ ∨ z) ∧ (y ∨ z̄) is in 2-CNF.

We shall now consider two languages: L_csat, the set of satisfiable Boolean formulas in CNF, and L_3sat, the set of satisfiable Boolean formulas in 3-CNF. We give log-space reductions of L_sat to L_csat and of L_csat to L_3sat, showing the latter two problems NP-complete by Lemma 13.3. In each case we map an expression to another expression that may not be equivalent, but is satisfiable if and only if the original expression is satisfiable.

Theorem 13.2 L_csat, the satisfiability problem for CNF expressions, is NP-complete.

Proof Clearly L_csat is in 𝒩𝒫, since L_sat is. We reduce L_sat to L_csat as follows. Let E be an arbitrary Boolean expression of length n.† Certainly the number of variable occurrences in E does not exceed n, nor does the number of ∧ and ∨ operators. Using the identities
complete. since I* at is. We reduce to as follows. Let E Proof Clearly Lt sat is in be an arbitrary Boolean expression of length n.| Certainly, the number of variable occurrences in E does not exceed n, nor does the number of a and v operators. Using the identities
-i(Ei
-i(E i
we can transform E
1
aE 2 = -i(E )v-i(E 2 vE 2 = -!(£,) a^(E 2
\E^
may
)
),
(13.1)
),
= Eu
—
i
operators are
more complex expressions. The validity
be checked by considering the four assignments of values 0 and
t Recall that the length of a
and
1
to an equivalent expression £', in which the
applied only to variables, never to (13.1)
)
Boolean expression
recall that this difference
is
is
the
number of characters, not
of no account where log-space reduction
is
of Eqs. 1
to£i
the length of its code,
concerned.
13.2
£2
and
Incidentally, the
.
SOME JVP-COMPLETE PROBLEMS
|
two of these equations are known as
first
329
DeM organ's
laws.
The transformation can be viewed as the composition of two log-space transformations. As a result of the first transformation, each negation symbol that immediately precedes a variable is replaced by a bar over the variable, and each closing parenthesis whose matching opening parenthesis is immediately preceded by a negation sign is replaced by )⌐. The symbol ⌐ indicates the end of the scope of a negation. This first transformation is easily accomplished in log-space, using a counter to locate the matching parentheses.

The second transformation is accomplished by a finite automaton that scans the input from left to right, keeping track of the parity (modulo 2 sum) of the active negations, those whose immediately following opening parenthesis, but not closing parenthesis, has been seen. When the parity of negations is odd, x is replaced by ¬x, ¬x by x, ∨ by ∧, and ∧ by ∨. The symbols ¬ and ⌐ are deleted. That this transformation is correct may be proved using (13.1) by an easy induction on the length of an expression. We now have an expression E' in which all negations are applied directly to variables.
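The effect of the two transformations together is to push every negation down to the variables using the identities (13.1). A minimal sketch in Python follows; the tuple encoding of expressions and the function name are our own, and the sketch works on the parse tree directly rather than in log-space on the string as the proof requires.

```python
# Push negations down to the variables using the identities (13.1).
# Expressions are nested tuples: ('var', name), ('not', e),
# ('and', e1, e2), or ('or', e1, e2).

def push_neg(e, negated=False):
    """Return an equivalent expression whose negations apply only to variables."""
    op = e[0]
    if op == 'var':
        return ('not', e) if negated else e
    if op == 'not':
        return push_neg(e[1], not negated)      # double negation cancels
    if op == 'and':
        new_op = 'or' if negated else 'and'     # DeMorgan: neg(E1 and E2) = neg(E1) or neg(E2)
    else:
        new_op = 'and' if negated else 'or'     # DeMorgan: neg(E1 or E2) = neg(E1) and neg(E2)
    return (new_op, push_neg(e[1], negated), push_neg(e[2], negated))
```

Applied to the expression ¬(¬(x_1 ∨ x_2) ∧ (¬x_1 ∨ x_2)) of Example 13.1 below, it yields (x_1 ∨ x_2) ∨ (x_1 ∧ ¬x_2).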
Next we create E'', an expression in CNF that is satisfiable if and only if E' is. Let V_1 and V_2 be sets of variables, with V_1 ⊆ V_2. We say an assignment of values to V_2 is an extension of an assignment of values to V_1 if the assignments agree on the variables of V_1. We shall prove by induction on r, the number of ∧'s and ∨'s in E', an expression all of whose negations are applied to variables, that if |E'| = n, then there is a list of at most n clauses, F_1, F_2, ..., F_k, over a set of variables that includes the variables of E' and at most n other variables, such that E' is given value 1 by an assignment to its variables if and only if there is an extension of that assignment that satisfies F_1 ∧ F_2 ∧ ⋯ ∧ F_k.

Basis r = 0. Then E' is a literal, and we may take that literal in a clause by itself to satisfy the conditions.
Induction If E' = E_1 ∧ E_2, let F_1, F_2, ..., F_k and G_1, G_2, ..., G_m be the clauses for E_1 and E_2 that exist by the inductive hypothesis. Assume without loss of generality that no variable that is not present in E' appears both among the F's and among the G's. Then F_1, F_2, ..., F_k, G_1, G_2, ..., G_m satisfies the conditions for E'. If E' = E_1 ∨ E_2, let the F's and G's be as above, and let y be a new variable. Then y ∨ F_1, y ∨ F_2, ..., y ∨ F_k, ¬y ∨ G_1, ¬y ∨ G_2, ..., ¬y ∨ G_m satisfies the conditions. In proof, suppose an assignment of values satisfies E'. Then it must satisfy E_1 or E_2. If the assignment satisfies E_1, then some extension of the assignment satisfies F_1, F_2, ..., F_k. Any further extension of this assignment that assigns y = 0 will satisfy all the clauses for E'. If the assignment satisfies E_2, a similar argument suffices. Conversely, suppose all the clauses for E' are satisfied by some assignment. If that assignment has y = 1, then all of G_1, G_2, ..., G_m must be satisfied, so E_2 is satisfied. A similar argument applies if y = 0. The desired expression E'' is all the clauses for E' connected by ∧'s.
INTRACTABLE PROBLEMS

To see that the above transformation can be accomplished in log-space, consider the parse tree for E'. Let y_i be the variable introduced by the ith ∨. The final expression is the logical AND of clauses, where each clause contains a literal of the original expression. In addition, if the literal is in the left subtree of the ith ∨, then the clause also contains y_i. If the literal is in the right subtree of the ith ∨, then the clause contains ¬y_i. The input is scanned from left to right. Each time a literal is encountered, a clause is emitted. To determine which y_i's and ¬y_i's to include in the clause, we use a counter of length log n to remember our place on the input. We then scan the entire input, and for each ∨ symbol, say the ith from the left, we determine its left and right operands, using another counter of length log n to count parentheses. If the current literal is in the left operand, we generate y_i; if it is in the right operand, we generate ¬y_i; and if it is in neither operand, we generate neither y_i nor ¬y_i. We have thus reduced each Boolean expression E to a CNF expression E'' that is in L_csat if and only if E is in L_sat. Since the reduction is accomplished in log-space, the NP-completeness of L_sat implies the NP-completeness of L_csat.
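The clause construction of the induction can be sketched directly: ∧ concatenates the two clause lists, and ∨ introduces a new variable added positively to the clauses of the left operand and negatively to those of the right. The Python sketch below uses its own representation (clauses as sets of (variable, polarity) literals) and numbers the new variables in order of use rather than by parse-tree position.

```python
from itertools import count

def to_cnf(e, fresh=None):
    """Clauses (sets of (name, polarity) literals) equisatisfiable with e.

    e is a nested tuple whose negations already apply only to variables.
    """
    if fresh is None:
        fresh = count(1)
    if e[0] == 'var':
        return [{(e[1], True)}]
    if e[0] == 'not':                        # e[1] must be a variable here
        return [{(e[1][1], False)}]
    left = to_cnf(e[1], fresh)
    right = to_cnf(e[2], fresh)
    if e[0] == 'and':                        # AND: just combine the clause lists
        return left + right
    y = 'y%d' % next(fresh)                  # OR: one new variable y
    return ([c | {(y, True)} for c in left] +
            [c | {(y, False)} for c in right])
```

On E' = (x_1 ∨ x_2) ∨ (x_1 ∧ ¬x_2), this produces the four clauses derived in Example 13.1 below.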
Example 13.1 Let

E = ¬(¬(x_1 ∨ x_2) ∧ (¬x_1 ∨ x_2)).

Applying DeMorgan's laws yields

E' = (x_1 ∨ x_2) ∨ (x_1 ∧ ¬x_2).

The transformation to CNF introduces variables y_1 and y_2 to give

E'' = (x_1 ∨ y_1 ∨ y_2) ∧ (x_2 ∨ ¬y_1 ∨ y_2) ∧ (x_1 ∨ ¬y_2) ∧ (¬x_2 ∨ ¬y_2).

Theorem 13.3 L_3sat, the satisfiability problem for 3-CNF expressions, is NP-complete.
Proof Clearly L_3sat is in NP, since L_csat is. Let E = F_1 ∧ F_2 ∧ ⋯ ∧ F_k be a CNF expression. Suppose some clause F_i has more than three literals, say F_i = α_1 ∨ α_2 ∨ ⋯ ∨ α_r, r ≥ 4. Introduce new variables y_1, y_2, ..., y_{r−3} and replace F_i by

(α_1 ∨ α_2 ∨ y_1) ∧ (α_3 ∨ ¬y_1 ∨ y_2) ∧ (α_4 ∨ ¬y_2 ∨ y_3) ∧ ⋯ ∧ (α_{r−2} ∨ ¬y_{r−4} ∨ y_{r−3}) ∧ (α_{r−1} ∨ α_r ∨ ¬y_{r−3}).        (13.2)

Then F_i is satisfied by an assignment if and only if an extension of that assignment satisfies (13.2). An assignment satisfying F_i must have α_j = 1 for some j. Assume that the assignment gives literals α_1, α_2, ..., α_{j−1} the value 0 and α_j the value 1. Then y_m = 1 for m ≤ j − 2 and y_m = 0 for m ≥ j − 1 is an extension of the assignment satisfying (13.2).
Conversely, we must show that any assignment satisfying (13.2) must have α_j = 1 for some j, and thus satisfies F_i. Assume to the contrary that the assignment gives all the α_m's the value 0. Then since the first clause has value 1, it follows that y_1 = 1. Since the second clause has value 1, y_2 must be 1, and by induction, y_m = 1 for all m. But then the last clause would have the value 0, contradicting the assumption that (13.2) is satisfied. Thus any assignment that satisfies (13.2) also satisfies F_i.

The only other alterations necessary are when F_i consists of one or two literals. In the latter case replace α_1 ∨ α_2 by (α_1 ∨ α_2 ∨ y) ∧ (α_1 ∨ α_2 ∨ ¬y), where y is a new variable, and in the former case an introduction of two new variables suffices. Thus E can be converted to a 3-CNF expression that is satisfiable if and only if E is satisfiable. The transformation is easily accomplished in log-space. We have thus a log-space reduction of L_csat to L_3sat, and conclude that L_3sat is NP-complete.
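The clause-splitting step for long clauses can be sketched as follows. Literals are (name, polarity) pairs and fresh is any source of unused variable names; both conventions are ours, not the book's.

```python
def split_clause(lits, fresh):
    """Replace a clause of r > 3 literals by the chain of 3-literal clauses (13.2)."""
    r = len(lits)
    if r <= 3:
        return [list(lits)]
    ys = [next(fresh) for _ in range(r - 3)]      # the new variables y1 .. y_{r-3}
    chain = [[lits[0], lits[1], (ys[0], True)]]   # (a1 or a2 or y1)
    for j in range(2, r - 2):                     # (a_{j+1} or not-y_{j-1} or y_j)
        chain.append([lits[j], (ys[j - 2], False), (ys[j - 1], True)])
    chain.append([lits[r - 2], lits[r - 1], (ys[-1], False)])
    return chain
```

For a five-literal clause this yields (α_1 ∨ α_2 ∨ y_1) ∧ (α_3 ∨ ¬y_1 ∨ y_2) ∧ (α_4 ∨ α_5 ∨ ¬y_2).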
The vertex cover problem

It turns out that 3-CNF satisfiability is a convenient problem to reduce to other problems in order to show them NP-complete, just as Post's correspondence problem is useful for showing other problems undecidable. Another NP-complete problem that is often easy to reduce to other problems is the vertex cover problem. Let G = (V, E) be an (undirected) graph with set of vertices V and edges E. A subset A ⊆ V is said to be a vertex cover of G if for every edge (v, w) in E, at least one of v or w is in A. The vertex cover problem is: Given a graph G and integer k, does G have a vertex cover of size k or less? We may represent this problem as a language L_vc, consisting of strings of the form: k in binary, followed by a marker, followed by the list of vertices, where v_i is represented by v followed by i in binary, and a list of edges, where (v_i, v_j) is represented by the codes for v_i and v_j surrounded by parentheses. L_vc consists of all such strings representing k and G, such that G has a vertex cover of size k or less.

Theorem 13.4 L_vc, the vertex cover problem, is NP-complete.

Proof To show L_vc in NP, guess a subset of k vertices and check that it covers all edges. This may be done in time proportional to the square of the length of the problem representation. L_vc is shown to be NP-complete by reducing 3-CNF satisfiability to L_vc.

Let F = F_1 ∧ F_2 ∧ ⋯ ∧ F_q be an expression in 3-CNF, where each F_i is a clause of the form (α_{i1} ∨ α_{i2} ∨ α_{i3}), each α_{ij} being a literal. We construct an undirected graph G = (V, E) whose vertices are pairs of integers (i, j), for 1 ≤ i ≤ q and 1 ≤ j ≤ 3. The vertex (i, j) represents the jth literal of the ith clause. The edges of the graph are

1) [(i, j), (i, k)], provided j ≠ k, and

2) [(i, j), (l, m)], if α_{ij} = ¬α_{lm}.
Each pair of vertices corresponding to the same clause is connected by an edge in (1). Each pair of vertices corresponding to a literal and its complement is connected by an edge in (2). G has been constructed so that it has a vertex cover of size 2q if and only if F is satisfiable. To see this, assume F is satisfiable, and fix an assignment satisfying F. Each clause must have a literal whose value is 1. Select one such literal for each clause. Delete the q vertices corresponding to these literals from V. The remaining vertices form a vertex cover of size 2q. Clearly for each i, only one vertex of the form (i, j) is missing from the cover, and hence each edge in (1) is incident upon† at least one vertex in the cover. Since edges in (2) are incident upon two vertices corresponding to some literal and its complement, and since we could not have deleted both a literal and its complement, one or the other of these vertices is in the cover. Thus we indeed have a cover of size 2q.

Conversely, assume we have a vertex cover of size 2q. For each i the cover must contain all but one vertex of the form (i, j), for if two such vertices were missing, an edge [(i, j), (i, k)] would not be incident upon any vertex in the cover. For each i assign value 1 to the literal α_{ij} corresponding to the vertex not in the cover. There can be no conflict, because two vertices not in the cover cannot correspond to a literal and its complement, else there would be an edge in group (2) not incident upon any vertex of the cover. For this assignment F has value 1. Thus F is satisfiable.

The reduction is easily accomplished in log-space. We can essentially use the variable names in the formula F as the vertices of G, appending two bits for the j-component in vertex (i, j). Edges of type (1) are generated directly from the clauses, while those of type (2) require two counters to consider all pairs of literals. Thus we conclude that L_vc is NP-complete.
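The construction in the proof is easy to program. A sketch follows; the clause and literal encodings are our own.

```python
def vc_graph(clauses):
    """Graph of the Theorem 13.4 construction, from a 3-CNF formula.

    clauses: list of three-literal clauses, each literal a (name, polarity) pair.
    Returns vertices (i, j) and undirected edges as frozensets of two vertices.
    """
    vertices = [(i, j) for i in range(len(clauses)) for j in range(3)]
    edges = set()
    for i in range(len(clauses)):
        for j in range(3):
            for k in range(j + 1, 3):              # type (1): same clause
                edges.add(frozenset([(i, j), (i, k)]))
    for (i, j) in vertices:                        # type (2): complementary literals
        for (l, m) in vertices:
            if (i, j) < (l, m):
                vi, pi = clauses[i][j]
                vl, pl = clauses[l][m]
                if vi == vl and pi != pl:
                    edges.add(frozenset([(i, j), (l, m)]))
    return vertices, edges
```

The formula is then satisfiable if and only if the resulting graph has a vertex cover of size 2q, where q is the number of clauses.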
Example 13.2 Consider the expression

F = (x_1 ∨ x_2 ∨ x_3) ∧ (x_1 ∨ x_2 ∨ x_4) ∧ (x_2 ∨ x_3 ∨ x_5) ∧ (x_3 ∨ x_4 ∨ x_5).

The construction of Theorem 13.4 yields the graph of Fig. 13.2. The assignment x_1 = 1, x_2 = 1, x_3 = 1, x_4 = 0 satisfies F and corresponds to the vertex cover [1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 3], [4, 1], [4, 3].

Fig. 13.2 Graph constructed by Theorem 13.4. Double circles indicate vertices in the set cover.

The Hamilton circuit problem

The Hamilton circuit problem is: Given a graph G, does G have a path that visits each vertex exactly once and returns to its starting point? The directed Hamilton circuit problem is the analogous problem for directed graphs. We represent these problems as languages L_h and L_dh by encoding graphs as in the vertex cover problem.

† An edge (v, w) is incident upon v and w and no other vertices.
Theorem 13.5 L_dh, the directed Hamilton circuit problem, is NP-complete.

Proof To show L_dh in NP, guess a list of arcs and verify that the arcs form a simple cycle† through all the vertices. To show L_dh NP-complete, we reduce 3-CNF satisfiability to L_dh.

† A simple cycle has no repeated vertex.

Let F = F_1 ∧ F_2 ∧ ⋯ ∧ F_q be an expression in 3-CNF, where each F_i is a clause of the form (α_{i1} ∨ α_{i2} ∨ α_{i3}), each α_{ij} being a literal. Let x_1, ..., x_n be the variables of F. We construct a directed graph G that is composed of two types of subgraphs. For each variable x_i there is a subgraph H_i of the form shown in Fig. 13.3(a), where m_i is the larger of the number of occurrences of x_i and ¬x_i in F. The H_i's are connected in a cycle, as shown in Fig. 13.3(b). That is, there are arcs from d_i to a_{i+1}, for 1 ≤ i < n, and an arc from d_n to a_1.

Suppose we had a Hamilton circuit for the graph of Fig. 13.3(b). We may as well suppose it starts at a_1. If it goes next to b_{10}, we claim it must then go to c_{10}, else c_{10} could never appear on the cycle. In proof, note that both predecessors of c_{10} are already on the cycle, and for the cycle to later reach c_{10} it would have to repeat a vertex. (This argument about Hamilton circuits occurs frequently in the proof. We shall simply say that a vertex like c_{10} "would become inaccessible.") Similarly, we may argue that a Hamilton circuit that begins a_1, b_{10} must continue c_{10}, b_{11}, c_{11}, b_{12}, c_{12}, .... If the circuit begins a_1, c_{10}, then it descends the ladder of Fig. 13.3(a) in the opposite way, continuing b_{10}, c_{11}, b_{11}, c_{12}, b_{12}, .... Likewise we may argue that when the circuit enters each H_i in turn, it may go from a_i to either b_{i0} or c_{i0}, but then its path through H_i is fixed; in the former case it descends by the arcs b_{ij} → c_{ij}, in the latter case by the arcs b_{ij} → c_{i,j+1}. In what follows, it helps to think of the choice to go from a_i to b_{i0} as making x_i true, while the opposite choice makes x_i false. With this in mind, observe that the graph of Fig. 13.3(b) has exactly 2^n Hamilton circuits, which correspond in a natural way to the 2^n assignments to the variables of F.

Fig. 13.3 Graphs concerned with directed Hamilton circuits.

For each clause F_j we introduce a subgraph I_j, shown in Fig. 13.3(c). I_j has the properties that if a Hamilton circuit enters it at r_j, it must leave at u_j; if it enters at s_j, it must leave at v_j; and if it enters at t_j, it must leave at w_j. In proof, suppose by symmetry that the circuit enters I_j at r_j.
Case 1 The next two vertices on the circuit are s_j and t_j. Then the circuit must continue with w_j, and if it leaves at w_j or v_j, u_j is inaccessible. Thus in this case it leaves at u_j.

Case 2 The next two vertices on the circuit are s_j and v_j. If the circuit does not next go to u_j, then u_j will be inaccessible. If after u_j it goes to w_j, vertex t_j cannot appear on the circuit, because its successors are already on the circuit. Thus in this case the circuit also leaves by u_j.

Case 3 The circuit goes directly to u_j. If it next goes to w_j, the circuit cannot include t_j, because its successors are already used. So again it must leave by u_j.

Observe that the above argument holds even though the circuit may enter I_j more than once. Finally, the graph I_j has the additional property that entering at r_j, s_j, or t_j, it can traverse all six vertices before exiting.
complete the construction of the graph, connect the 7/s to the
77,'s
Ij
I
}
as
Suppose the first term in F, is x,. Then pick some c ip that has not yet been connected to any I k and introduce an arc from c ip to r} and from u} to b ifP + j. If the first term is x h pick an unused b ip and introduce arcs b ip -+rj and Uj-*c itP+1 Make analogous connections with Sj and Vj for the second term of Fj9 and analogous connections with t} and Wj for the third term. Each 77, was chosen sufficiently long that enough pairs of 6,/s and c /s are available to make all the connections. If the expression F is satisfiable, we can find a Hamilton circuit for the graph as follows. Let the circuit go from a to b i0 if x,- is true in the satisfying assignment, and from a to c i0 otherwise. Then, ignoring the 7 y's, we have a unique Hamilton circuit for the subgraph of Fig. 13.3(b). Now, whenever the constructed circuit uses an arc b ik -» c k+ or c ik -> b itk + u and b ik or c, respectively, has an arc to an Ij subgraph that has not yet been visited, visit all six vertices of I p emerging at follows.
.
f
t
{
fc ,
i
Ci,k+i or b iM+lJ respectively. traverse l } for all j.
Conversely,
we must show
satisfiable. Recall that in left
at Uj,
or
Vj,
wj9
fact that
respectively.
N excursions
Ij
F
is
satisfiable implies that
that the existence of a
any Hamilton
cerned, connections to an c ik~~* bi.k+itraverse the
The
Thus
circuit
an
l
}
Hamilton entered at
circuit implies rj9 sj9
as far as paths through the
look like arcs
in parallel
to the 7/s are ignored,
it
we can
or
t
77,'s
with an arc b ik
-*>
i
F is
must be are con-
c Lk+
follows that the circuit
j
or
must
one of the 2 n ways which are possible without the 7/s; that is, it may follow the arc a -> b i0 or a -> c i0 for 1 < i < n. Each set of choices determines a truth assignment for the x^s. If one set of choices yields a Hamilton circuit, including the 7/s, then the assignment must satisfy all the clauses. For example, if we reach Ij from b ik in the circuit, then x, is a term in F; and it must be that the circuit goes from a to c l0 which corresponds to the choice x, = 0. Note that if the circuit goes from a to b i0 then it must traverse b k+l before c k + and we could 7/,'s in
x
x
-,
,
{
t
not traverse
,
between b ik and c ifk+l
i
i
x
could never be included
in
remark, we must prove we have a log-space reduction. Given F,
we
Ij
,
as b ik+l
the circuit.
As a
last
and arcs of 77, simply by counting occurrences of x, and x We can list the connections between the 77,'s and 7/s easily as well. Given a term like x, in Fj, we can find a free pair of vertices in 77, to connect to Ij by counting can
list
the vertices
.
f
INTRACTABLE PROBLEMS
336
F
occurrences of x in f
l9
F2
Fj- V As no count gets above the number of
,
variables or clauses, log n space
Example 13.3 Let F be

(x_1 ∨ x_2 ∨ x_3) ∧ (¬x_1 ∨ ¬x_2 ∨ ¬x_3).

The graph constructed from F by Theorem 13.5 is shown in Fig. 13.4. A Hamilton circuit corresponding to the assignment x_1 = 1, x_2 = 0, x_3 = 0 is drawn in heavy lines.
Finally we show that the Hamilton circuit problem for undirected graphs is NP-complete, by reducing the directed Hamilton circuit problem to it.

Theorem 13.6 L_h, the Hamilton circuit problem for undirected graphs, is NP-complete.

Proof To show that L_h is in NP, guess a list of the edges and verify that they form a Hamilton circuit. To show L_h NP-complete we reduce L_dh to it. Let G = (V, E) be a directed graph. Construct an undirected graph G' with vertices v_0, v_1, and v_2 for each v in V, and edges

1) (v_0, v_1) for each v in V,

2) (v_1, v_2) for each v in V,

3) (v_2, w_0) if and only if v → w is an arc in E.

Each vertex in V has been expanded into three vertices. Vertices with subscript 1 have only two edges, and since a Hamilton circuit must visit all vertices, the subscripts of the vertices in any Hamilton circuit of G' must be in the order 0, 1, 2, 0, 1, ... or its reverse. Assume the order is 0, 1, 2, .... Then the edges whose subscripts go from 2 to 0 correspond to a Hamilton circuit in G. Conversely, a Hamilton circuit in G may be converted to a Hamilton circuit in G' by replacing an arc v → w by the path from v_0 to v_1 to v_2 to w_0. Thus G' has a Hamilton circuit if and only if G has a Hamilton circuit. The reduction of G to G' is easily accomplished in log-space. Thus we conclude that L_h is NP-complete.
Fig. 13.4 Graph constructed for Example 13.3.

Integer linear programming

Most known NP-complete problems are easily shown to be in NP, and only the reduction from a known NP-complete problem is difficult. We shall now give an example of a problem where the opposite is the case. It is easy to prove that integer linear programming is NP-hard but difficult to show it is in NP. The integer linear programming problem is: Given an m × n matrix of integers A and a column vector b of m integers, does there exist a column vector of integers x such that Ax ≥ b? The reader may formalize this problem as a language in an obvious way, where the words of the language are the elements of A and b written in binary.

Lemma 13.4 Integer linear programming is NP-hard.
Proof We reduce 3-CNF satisfiability to integer linear programming. Let E = F_1 ∧ F_2 ∧ ⋯ ∧ F_q be an expression in 3-CNF, and let x_1, x_2, ..., x_n be the variables of E. The matrix A will have a column for each literal x_i or ¬x_i, 1 ≤ i ≤ n. We may thus view the inequality Ax ≥ b as a set of linear inequalities among the literals. For each i, 1 ≤ i ≤ n, we have the inequalities

x_i + ¬x_i ≥ 1,
−x_i − ¬x_i ≥ −1,
x_i ≥ 0,
¬x_i ≥ 0,

which have the effect of saying that one of x_i and ¬x_i is 0, and the other is 1. For each clause α_1 ∨ α_2 ∨ α_3 we have the inequality

α_1 + α_2 + α_3 ≥ 1,

which says that at least one literal in each clause has value 1. It is obvious that A and b can be constructed in log-space, and the inequalities are all satisfied if and only if E is satisfiable. Thus integer linear programming is NP-hard.
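A sketch of this construction in Python follows. We place the column for x_i at index 2i and the column for its complement at index 2i + 1, and encode clause literals as (variable index, polarity) pairs; these conventions are ours.

```python
def ilp_from_3cnf(n, clauses):
    """Build A, b as in Lemma 13.4, so that Ax >= b has an integer solution
    iff the 3-CNF formula over variables 0 .. n-1 is satisfiable."""
    A, b = [], []
    for i in range(n):
        row = [0] * (2 * n)
        row[2 * i] = row[2 * i + 1] = 1
        A.append(row); b.append(1)                  # x_i + x_i' >= 1
        A.append([-v for v in row]); b.append(-1)   # -x_i - x_i' >= -1
        for col in (2 * i, 2 * i + 1):              # x_i >= 0 and x_i' >= 0
            r = [0] * (2 * n)
            r[col] = 1
            A.append(r); b.append(0)
    for cl in clauses:                              # a1 + a2 + a3 >= 1
        row = [0] * (2 * n)
        for i, pol in cl:
            row[2 * i + (0 if pol else 1)] += 1
        A.append(row); b.append(1)
    return A, b
```

The first four families of rows force every solution to be, in effect, a 0/1 truth assignment, so feasibility can be checked by enumeration in small examples.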
To show integer linear programming is in NP, we may guess a vector x and check that Ax ≥ b. However, if the smallest solution has elements that are too large, we may not be able to write x down in polynomial time. The difficulty is to show that the elements of x need not be too large, and for this we need some concepts from linear algebra, specifically determinants of square matrices, the rank of a matrix, linear independence of vectors, and Cramer's rule for solving simultaneous linear equations, with all of which we expect the reader to be familiar.

In what follows, we assume matrix A and vector b form an instance of the integer linear programming problem, and that A has m rows and n columns. Let α be the magnitude of the largest element of A or b. Note that the number of bits needed to write out A and b is at least mn + log_2 α, and we shall use this quantity as a lower bound on the input size; our nondeterministic solution finder will work in NTIME(p(mn + log_2 α)) for some polynomial p. Further, we define a_i, for 1 ≤ i ≤ m, to be the vector of length n consisting of the ith row of A. We let b_i be the ith element of b, and we let x = (x_1, x_2, ..., x_n) be a vector of unknowns. We use |i| for the magnitude of integer i, and det B for the determinant of matrix B. A series of technical lemmas is needed.

Lemma 13.5 If B is a square submatrix of A, then |det B| ≤ (αq)^q, where q = max(m, n).

Proof Recall that the determinant of a k × k matrix is the sum or difference of k! terms, each of which is the product of k elements. Therefore, if B is a k × k submatrix, k!α^k is an upper bound on |det B|. As k! ≤ k^k and k ≤ q, we have our lemma.
Lemma 13.6 Let the rows of A have rank r.† If r < n, then there is an integer vector z = (z_1, z_2, ..., z_n), not identically zero, such that Az = 0 (0 is the vector of all 0's), and no z_i exceeds (αq)^{2q} in magnitude, where q = max(m, n).

Proof Assume without loss of generality that B, the r × r submatrix of A in the upper left corner, has a nonzero determinant. Let C be the first r rows of A and let D be the last m − r rows of A. As any r + 1 rows of A are linearly dependent, and the rows of C are linearly independent (because B has a nonzero determinant), each row of D can be expressed as a linear combination of rows of C. That is, D = EC for some (m − r) × r matrix E. Then Az = 0 if and only if Cz = 0 and ECz = 0. It suffices, therefore, to show we can make Cz = 0. If we choose z_n = −1 and z_{r+1} = z_{r+2} = ⋯ = z_{n−1} = 0, then Cz = 0 if and only if By = w, where y is the vector (z_1, z_2, ..., z_r) and w is the nth column of C. By Cramer's rule, By = w is satisfied if we take z_i = det B_i / det B, where B_i is B with the ith column replaced by w. By Lemma 13.5, these determinants do not exceed (αq)^q in magnitude. The resulting z may not have integer components, but if we multiply all components by det B, they will be integers and will still satisfy Az = 0. When we do so, z_n = −det B, the magnitudes of the first r components of z do not exceed (αq)^{2q}, and components r + 1 through n − 1 are 0. It follows that the solution z can be written with a number of bits that is at most the second power of mn + log_2 α, the size of the problem statement.
Lemma 13.7 Let A be a matrix with at least one nonzero element. If there is a solution to Ax ≥ b, then there is a solution x_0 in which for some i, b_i ≤ a_i x_0 < b_i + α, where α is the magnitude of the largest element of A.

Proof Let x_0 be a solution to Ax ≥ b. Suppose a_i x_0 ≥ b_i + α for all i. Adding or subtracting 1 from some component of x_0 must reduce some product a_i x_0. Furthermore, no product can decrease by more than α. Thus the new x_0 is also a solution. The process cannot be repeated indefinitely without obtaining a solution x for which there is an i such that b_i ≤ a_i x < b_i + α.
Theorem 13.7 Integer linear programming is NP-complete.

Proof By Lemma 13.4 we have only to show that the problem is in NP. We begin by guessing the signs of the x_i's in some hypothetical solution, and adding n constraints x_i ≤ 0 or x_i ≥ 0, depending on the sign guessed. Then guess a row a_i and a constant c_i in the range b_i ≤ c_i < b_i + α such that in some solution x_0 we have a_i x_0 = c_i.

† Recall that the rank of a matrix is equivalently defined as the maximum number of linearly independent rows, the maximum number of linearly independent columns, or the size of the largest square submatrix with a nonzero determinant.

Now suppose that after reordering rows if necessary, we have correctly
guessed c_1, c_2, ..., c_k such that

1) b_i ≤ c_i < b_i + (αq)^{2q+1}, and

2) Ax ≥ b has a nonnegative integer solution if and only if a_i x = c_i, 1 ≤ i ≤ k, and a_i x ≥ b_i, k < i ≤ m, has such a solution.

Let A_k be the first k rows of A, and let c be the vector (c_1, c_2, ..., c_k).

Case 1 The rank of A_k is less than n. By Lemma 13.6 there is an integer vector z, none of whose components has a magnitude greater than (αq)^{2q}, such that A_k z = 0. Therefore, if A_k x_0 = c, it follows that A_k(x_0 + dz) = c for any integer d. If it is also true that a_i x_0 > b_i + (αq)^{2q+1} for all i > k, then we may repeatedly add or subtract 1 from d until for some j > k, a_j(x_0 + dz) drops below b_j + (αq)^{2q+1}. In proof, note that z has some nonzero component z_f, and the row expressing the guessed sign constraint on x_f, being all zero except for a one in that component, must have index greater than k, since A_k z = 0. Thus some a_j(x_0 + dz) for j > k must eventually drop below b_j + (αq)^{2q+1}. Since each component of z is bounded in magnitude by (αq)^{2q}, changing d by 1 cannot change any a_j(x_0 + dz) by more than αn(αq)^{2q}, which is no more than (αq)^{2q+1}. Therefore a_j(x_0 + dz) ≥ b_j. By reordering rows, we may assume j = k + 1 and repeat the above process for k + 1 in place of k.
Case 2 The rank of A_k is n. In this case, there is a unique x satisfying A_k x = c. By Cramer's rule, the components of x are ratios of two determinants whose magnitudes do not exceed ((αq)^{2q+1})(αq)^{q−1}, which is less than (2αq)^{3q+1}. We may check whether this x consists only of integers and satisfies a_j x ≥ b_j for j > k.

The nondeterministic process of guessing the c_i's repeats at most n times, and any sequence of choices requires a number of arithmetic steps that is polynomial in q [since Cramer's rule can be applied in O(r^4) arithmetic steps for r × r matrices], applied to integers whose length in binary is polynomial in q log αq. The arithmetic steps that are multiplication or division of integers can be performed in time proportional to the square of the length of the integers in binary,† and addition and subtraction can be performed in linear time. Thus the entire process takes time that is polynomial in the input length, since that length is at least mn + log_2 α.
Other NP-complete problems

There is a wide variety of other known NP-complete problems. We shall list some of them here.

† Actually in considerably less time (see Aho, Hopcroft, and Ullman [1974]), although this is of no importance here.

1) The Chromatic Number Problem. Given a graph G and an integer k, can G be colored with k colors so that no two adjacent vertices are the same color?

2) The Traveling Salesman Problem. Given a complete graph with weights on the edges, what is the Hamilton circuit of minimum weight? To express this problem as a language, we require the weights to be integers and ask whether there is a Hamilton circuit of weight k or less. This problem is NP-complete even if we restrict the weights to 0 and 1, when it becomes exactly the Hamilton circuit problem.
3) The Exact Cover Problem. Given a list S_1, S_2, ..., S_k of subsets of some set U, is there a subcollection whose union is U, such that each pair of sets in the subcollection is disjoint?

4) The Partition Problem. Given a list of integers i_1, i_2, ..., i_k, does there exist a subset whose sum is exactly ½(i_1 + i_2 + ⋯ + i_k)? Note that this problem appears to be in P until we remember that the length of an instance is not i_1 + i_2 + ⋯ + i_k, but the sum of the lengths of the i_j's written in binary or some other fixed base.
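One reading of the remark that the partition problem "appears to be in P" is the standard dynamic program, sketched below (our illustration, not from the text): its running time is polynomial in the sum of the integers, and that sum can be exponential in the number of bits used to write the instance down.

```python
def can_partition(ints):
    """Decide the partition problem in time polynomial in sum(ints) --
    pseudo-polynomial, i.e. exponential in the binary input length."""
    total = sum(ints)
    if total % 2:
        return False
    reachable = {0}                      # sums achievable by some subset
    for i in ints:
        reachable |= {s + i for s in reachable}
    return total // 2 in reachable
```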
Among the NP-complete problems are many, including the ones mentioned in this section, for which serious effort has been expended on finding polynomial-time algorithms, and so far none have been found to be in P. Since either all or none of the NP-complete problems are in P, it is natural to conjecture that none are in P. More importantly, if one is faced with an NP-complete problem to solve, it is questionable whether one should even bother to look for a polynomial-time algorithm. We believe one is much better off looking for heuristics that work well on the particular kinds of instances that one is likely to encounter.
Extended significance of NP-completeness

We have inadvertently implied that the only issue regarding NP-complete problems is whether they require polynomial or exponential time. In fact, the true answer could be between these extremes; for example, they could require n^{log n} time. If all languages in NP are log-space or even polynomial-time reducible to L, and L is in, say, DTIME(n^{c log n}) for some constant c, then every language in NP is in DTIME(n^{c' log n}) for some constant c'. In general, if L were log-space or polynomial-time complete for NP, and L were in DTIME(T(n)), then

NP ⊆ ∪_{c>0} DTIME(T(n^c)).
13.3 THE CLASS co-NP

It is unknown whether NP is closed under complementation. Should it turn out that NP is not closed under complementation, then clearly P ≠ NP, since P is closed under complementation. There is no NP-complete problem whose complement is known to be in NP. For example, to determine nonsatisfiability for a Boolean formula with n variables, it appears necessary to test every one of the 2^n possible assignments, even if the algorithm is nondeterministic. In fact, if any NP-complete problem is discovered to have its complement in NP, then NP would be closed under complementation, as we show in the next theorem.
Theorem 13.8 NP is closed under complementation if and only if the complement of some NP-complete problem is in NP.

Proof The "only if" part is obvious. For the "if" part, let S be an NP-complete problem whose complement is in NP. Since each L in NP is log-space reducible to S, the complement of each such L is log-space reducible to the complement of S. Thus the complement of L is in NP.

We shall define the class co-NP to be the set of complements of the languages in NP. The relationship between P, NP, co-NP, and PSPACE is shown in Fig. 13.5, although it is not known for certain that any of the regions, except the one labeled P, is nonempty.

Fig. 13.5 Relations among some language classes.
The problem of primality

It is interesting to consider a problem in NP, such as "nonprimeness," for which there is no known polynomial-time algorithm† and which furthermore is not known to be NP-complete.‡ To test an integer to see if it is not a prime, one simply guesses a divisor and checks. The interesting observation is that the complementary problem is also in NP, which suggests that there may be sets in the intersection of NP and co-NP that are not in P.

† Although Miller [1976] presents strong evidence that one exists.
‡ This is another problem that appears to be in P until one remembers that the size of the input p is log_2 p, not p itself.

We now consider a nondeterministic polynomial-time algorithm for testing whether an integer is prime.

Lemma 13.8 Let x and y be integers, with 0 ≤ x, y < p. Then

1) x + y (mod p) can be computed in time O(log p);
13.4
PSPACE-COMPLETE PROBLEMS
|
343
2
2)
xy (mod
p) can be
computed
in
time 0(log p);
3)
x y (mod
p) can be
computed
in
time 0(log 3 p).
(1) and (2) are obvious since an integer mod p requires only log p bits. For compute x y by repeated squaring to get x 2 x 4 x 8 x 2 mod p, where [log 2 y J, then multiply the appropriate powers of x to get x y
Proof (3) i
=
'
,
.
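A minimal sketch of the repeated-squaring computation of part (3), in Python (our illustration, not part of the original text):

```python
def mod_exp(x, y, p):
    # Compute x^y mod p by repeated squaring, as in part (3) of Lemma 13.8.
    # Each iteration squares once and multiplies at most once, so O(log y)
    # multiplications suffice; with O(log^2 p)-time multiplication this
    # yields the O(log^3 p) bound of the lemma.
    result = 1
    square = x % p            # holds x^(2^i) mod p on the ith iteration
    while y > 0:
        if y & 1:             # this bit of y selects the power x^(2^i)
            result = (result * square) % p
        square = (square * square) % p
        y >>= 1
    return result

print(mod_exp(3, 6, 7))       # prints 1, since 3^6 = 729 = 104*7 + 1
```

Python's built-in three-argument pow computes the same thing; the explicit loop is shown only to mirror the proof.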
We shall, in what follows, make use of Fermat's theorem: that is, p > 2 is a prime if and only if there exists an x of order p - 1; that is, for some x, 1 < x < p:
1) x^{p-1} = 1 mod p, and
2) x^i ≠ 1 mod p, for 1 ≤ i < p - 1.

Theorem 13.9  The set of primes is in NP.
Proof  If x = 2, then x is prime. If x = 1 or x is an even integer greater than 2, then x is not prime. To determine if p is prime for odd p greater than 2, guess an x, 0 < x < p, and verify that
1) x^{p-1} = 1 mod p, and
2) x^i ≠ 1 mod p for all i, 1 ≤ i < p - 1.

Condition (1) is easily checked in O(log^3 p) steps. We cannot check condition (2) directly, since there are too many i's. Instead, guess the prime factorization of p - 1. Let the factorization be p - 1 = p_1 p_2 ... p_k. Recursively verify that each p_j is a prime. Verify that p - 1 is the product of the p_j's. Finally verify that x^{(p-1)/p_j} ≠ 1 mod p for each p_j.

To see that this suffices, observe that if x^i = 1 mod p, then the least i satisfying x^i = 1 mod p must divide p - 1. Furthermore, any multiple of this i, say ai, must also satisfy x^{ai} = 1 mod p. Thus, if there is an i < p - 1 such that x^i = 1 mod p, then for some p_j, x^{(p-1)/p_j} = 1 mod p.

Assume that the nondeterministic time to recognize that p is prime is bounded by c log^4 p. Then we need only observe that, for some sufficiently large constant c,

    c log^4 p_1 + c log^4 p_2 + ... + c log^4 p_k + c_1 log^3 p ≤ c log^4 p.
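The recursive verification of Theorem 13.9 can be sketched as follows (our illustration; the dictionary cert plays the role of the nondeterministic guesses, supplying for each odd prime in the recursion the witness x and the factorization of p - 1):

```python
def is_prime_certified(p, cert):
    # cert[q] = (x, factors): a guessed x of order q - 1 and the guessed
    # prime factorization of q - 1, for each odd prime q in the recursion.
    if p == 2:
        return True
    if p < 2 or p % 2 == 0:
        return False
    x, factors = cert[p]
    prod = 1
    for q in factors:
        prod *= q
    if prod != p - 1:                  # verify p - 1 is the product of the p_j's
        return False
    if pow(x, p - 1, p) != 1:          # condition (1): x^(p-1) = 1 mod p
        return False
    for q in set(factors):
        if not is_prime_certified(q, cert):   # recursively verify each p_j is prime
            return False
        if pow(x, (p - 1) // q, p) == 1:      # condition (2) via the factor test
            return False
    return True

# 7 - 1 = 2 * 3, and x = 3 has order 6 mod 7; 3 - 1 = 2, and x = 2 has order 2 mod 3.
cert = {7: (3, [2, 3]), 3: (2, [2])}
print(is_prime_certified(7, cert))     # prints True
```

A deterministic machine would have to search for x and the factorization; the nondeterministic machine of the theorem simply guesses them and runs this check.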
13.4 PSPACE-COMPLETE PROBLEMS

We now show several problems to be complete for PSPACE with respect to polynomial time.
Quantified Boolean formulas

Quantified Boolean formulas (QBF) are built from variables, the operators ∧, ∨, and ¬, parentheses, and the quantifiers ∃ ("there exists") and ∀ ("for all"). When defining the QBFs recursively, we find it useful simultaneously to define free occurrences of variables (occurrences to which no quantifier applies), bound occurrences of variables (occurrences to which a quantifier applies), and the scope of a quantifier (those occurrences to which the quantifier applies).

1) If x is a variable, then x is a QBF. The occurrence of x is free.
2) If E_1 and E_2 are QBFs, so are ¬(E_1), (E_1) ∧ (E_2), and (E_1) ∨ (E_2). An occurrence of x is free or bound, depending on whether the occurrence is free or bound in E_1 or E_2. Redundant parentheses can be omitted.
3) If E is a QBF, then ∃x(E) and ∀x(E) are QBFs. The scopes of ∃x and ∀x are all free occurrences of x in E. (Note that there may also be bound occurrences of x in E; these are not part of the scope.) Free occurrences of x in E are bound in ∃x(E) and ∀x(E). All other occurrences of variables in E are free or bound, depending on whether they are free or bound in E.

A QBF with no free variable has a value of either true or false, which we denote by the Boolean constants 1 and 0. The value of such a QBF is determined by replacing each subexpression of the form ∃x(E) by E_0 ∨ E_1, and each subexpression of the form ∀x(E) by E_0 ∧ E_1, where E_0 and E_1 are E with all occurrences of x in the scope of the quantifier replaced by 0 and 1, respectively. The QBF problem is to determine whether a QBF with no free variables has value true.

Example 13.4  ∀x[∀x[∃y(x ∨ y)] ∧ ¬x] is a QBF. The scope of the inner ∀x is the first occurrence of x; the scope of the outer ∀x is the second occurrence. To test the truth of the above formula, we must check that ∀x[∃y(x ∨ y)] ∧ ¬x is true when the free occurrences of x (that is, the second occurrence only) are set to 0 and also when set to 1. The first clause ∀x(∃y(x ∨ y)) is seen to be true, as when this x is 0 or 1 we may choose y = 1 to make x ∨ y true. However, ¬x is not made true when x = 1, so the entire expression is false.

Note that a Boolean expression E with variables x_1, x_2, ..., x_k is satisfiable if and only if the QBF ∃x_1 ∃x_2 ... ∃x_k (E) is true. Thus the satisfiability problem is a special case of the problem of whether a QBF is true, which immediately tells us that the QBF problem is NP-hard. It does not appear that QBF is in NP, however.
PSPACE-completeness of the QBF problem

Lemma 13.9  The QBF problem is in PSPACE.

Proof  A simple recursive procedure EVAL can be used to compute the value of a QBF with no free variables. In fact, EVAL will handle a slightly more general problem, where the Boolean constants 0 and 1 have been substituted for some variables. If the QBF consists of a Boolean constant, EVAL returns that constant. If the QBF consists of a Boolean operator applied to subformula(s), then EVAL evaluates the subformulas recursively and then applies the operator to the result(s). If the QBF is of the form ∃x(E) or ∀x(E), then EVAL replaces all occurrences of x in E that are in the scope of the quantifier by 0 to obtain E_0, and replaces the occurrences of x by 1 to obtain E_1. Next EVAL evaluates E_0 recursively, and then evaluates E_1 recursively. In the case of ∃x(E), EVAL returns the OR of the two results. In the case of ∀x(E), EVAL returns the AND.

Since the number of operators plus quantifiers is at most n for a QBF of length n, the depth of recursion is at most n. Using a Turing tape for the stack of activation records (as in Theorem 12.11), we see that the tape need never grow longer than the square of the length of the original QBF. Thus the QBF problem is in PSPACE.
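A sketch of EVAL in Python (our rendering, not from the text; formulas are nested tuples, and the substitution of 0 and 1 for a bound variable is handled with an environment):

```python
def eval_qbf(f, env=None):
    # f is ('const', b), ('var', name), ('not', e), ('and', e1, e2),
    # ('or', e1, e2), ('exists', x, e), or ('forall', x, e).
    env = env or {}
    op = f[0]
    if op == 'const':
        return f[1]
    if op == 'var':
        return env[f[1]]
    if op == 'not':
        return not eval_qbf(f[1], env)
    if op == 'and':
        return eval_qbf(f[1], env) and eval_qbf(f[2], env)
    if op == 'or':
        return eval_qbf(f[1], env) or eval_qbf(f[2], env)
    x, e = f[1], f[2]
    v0 = eval_qbf(e, {**env, x: False})   # E_0: x replaced by 0
    v1 = eval_qbf(e, {**env, x: True})    # E_1: x replaced by 1
    return (v0 or v1) if op == 'exists' else (v0 and v1)

# Example 13.4: the inner quantifier shadows the outer one on x.
f = ('forall', 'x', ('and',
        ('forall', 'x', ('exists', 'y', ('or', ('var', 'x'), ('var', 'y')))),
        ('not', ('var', 'x'))))
print(eval_qbf(f))    # prints False, as computed in Example 13.4
```

Recursion depth is bounded by the number of quantifiers and operators in the formula, mirroring the space analysis in the proof.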
Theorem 13.10  The problem of deciding whether a QBF is true is PSPACE-complete.

Proof  By Lemma 13.9, we need show only that the language of coded true QBF's is PSPACE-hard. That is, we must show that every language in PSPACE is polynomial-time reducible to L_qbf.

Let M be a one-tape polynomial space-bounded DTM accepting a language L. Then for some constant c and polynomial p, M makes no more than c^{p(n)} moves on inputs of length n. We can code ID's of M as in Theorem 13.1, using the Boolean variables c_{iX}, 1 ≤ i ≤ p(n), with X a tape symbol or a composite symbol representing a symbol and the state of M. Since M is deterministic, there is no need to code a choice of moves in the composite symbol. Our goal is to construct for each j a QBF F_j(I_1, I_2), where

1) I_1 and I_2 are each distinct sets of variables, one for each i, 1 ≤ i ≤ p(n), and each tape symbol or composite symbol X, analogous to the c_{iX} of Theorem 13.1. Say I_1 = {c_{iX} | 1 ≤ i ≤ p(n) and X is such a symbol}, and I_2 = {d_{iY} | 1 ≤ i ≤ p(n) and Y is such a symbol}, and

2) F_j(I_1, I_2) is true if and only if I_1 and I_2 represent ID's β_1 and β_2 of M (that is, for each i, exactly one c_{iX} and exactly one d_{iY} is true), and β_1 goes to β_2 by a sequence of at most 2^j moves, where β_1 = X_1 X_2 ... X_{p(n)}, β_2 = Y_1 Y_2 ... Y_{p(n)}, and X_i and Y_i are the symbols such that c_{iX_i} and d_{iY_i} are true.

Then given x of length n we may write a QBF

    Q_x = ∃I_0 ∃I_f [F_{p(n) log c}(I_0, I_f) ∧ INITIAL(I_0) ∧ FINAL(I_f)],

where ∃I_0 and ∃I_f stand for a collection of existentially quantified variables, one for each symbol X and integer i, 1 ≤ i ≤ p(n), as above. INITIAL(I_0) is a propositional formula that says the variables in the set I_0 represent the initial ID of M with input x, and FINAL(I_f) expresses the fact that I_f represents an accepting ID of M. Then Q_x is true if and only if x is in L(M). INITIAL and FINAL can be written in time that is polynomial in n, using the techniques of Theorem 13.1.

We now show how to construct, for each j, the formula F_j(I_1, I_2). The basis, j = 0, is easy. Using the technique of Theorem 13.1, we have only to express as a Boolean formula the facts that I_1 and I_2 represent ID's, say β_1 and β_2 (that is, exactly one variable for each position in I_1 and I_2 is true), and that either
1) β_1 = β_2, or
2) β_1 goes to β_2 in one move.

For the induction step, we are tempted to write

    F_j(I_1, I_2) = (∃I)[F_{j-1}(I_1, I) ∧ F_{j-1}(I, I_2)].

However, if we do so, F_j has roughly double the length of F_{j-1}, and the length of F_{p(n) log_2 c} will be at least c^{p(n)}, and therefore cannot be written in polynomial time. Instead we use a trick that enables us to make two uses of an expression like F_{j-1} in only a small amount (polynomial in n) more space than is required for one use. The trick is to express that there exist J and K such that if J = I_1 and K = I, or J = I and K = I_2, then F_{j-1}(J, K) must be true. The QBF for this is

    F_j(I_1, I_2) = ∃I ∀J ∀K [(¬(J = I_1 ∧ K = I) ∧ ¬(J = I ∧ K = I_2)) ∨ F_{j-1}(J, K)].    (13.3)

We use expressions like J = I_1 to mean that for each pair of corresponding variables in the sets J and I_1 (those representing the same position and symbol), either both are true or both are false. Intuitively, Equation (13.3) states that whenever the pair (J, K) is either (I_1, I) or (I, I_2), then F_{j-1}(J, K) must be true. This allows us to assert that both F_{j-1}(I_1, I) and F_{j-1}(I, I_2) are true using only one copy of F_{j-1}; F_{j-1} is used as a "subroutine" that is "called" twice.

The number of symbols in F_j, counting any variable as one, is O(p(n)) plus the number of symbols in F_{j-1}. Since (13.3) introduces O(p(n)) variables (in the sets I, J, and K), the number of variables in F_j is O(jp(n)). Thus we can code a variable with O(log j + log p(n)) bits. It follows by induction on j that F_j can be written in time O(jp(n)(log j + log p(n))). If we let j = p(n) log c and observe that log j + log p(n) = O(log n), we see that Q_x can be written in time O(p^2(n) log n). Since M is an arbitrary polynomial space-bounded TM, we have exhibited a polynomial-time reduction of L(M) to L_qbf. Thus L_qbf is PSPACE-
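The point of the trick in Equation (13.3) can be seen by comparing symbol counts under the two recurrences (a toy calculation of ours; BASE and OVERHEAD are made-up constants standing in for |F_0| and the O(p(n)) symbols each level adds):

```python
BASE = 100       # stand-in for |F_0|
OVERHEAD = 50    # stand-in for the O(p(n)) symbols each level adds

def naive_length(j):
    # The tempting definition copies F_{j-1} twice, doubling each level.
    return BASE if j == 0 else 2 * naive_length(j - 1) + OVERHEAD

def trick_length(j):
    # Equation (13.3) writes F_{j-1} once, so length grows only linearly in j.
    return BASE if j == 0 else trick_length(j - 1) + OVERHEAD

print(naive_length(20) >= BASE * 2**20)   # prints True: exponential in j
print(trick_length(20))                    # prints 1100: linear in j
```

With j = p(n) log c, only the second recurrence keeps the formula polynomially long.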
Another PSPACE-complete problem worth noting is context-sensitive recognition: given a CSG G and a string w, is w in L(G)? This result is surprising, since the CSL's occupy the "bottom" of PSPACE, being exactly NSPACE(n) and contained in DSPACE(n^2). However, the "padding" technique used in the translation lemma (Lemma 12.2) makes a proof possible.

To begin, pick a straightforward binary code for grammars, as we have done for Turing machines. Let L_cs be the language consisting of all strings x#w, where x is the code for a CSG G_x and w is a coded string, over the input alphabet of G_x, that is in L(G_x). Assume that for a given grammar, all grammar symbols are coded by strings of the same length. It is easy to design an LBA that, given input x#w, guesses a derivation in G_x such that no sentential form exceeds the length of the string coded by w. The coded sentential form can be stored on a second track under the cells holding w. Moves are determined by consulting the x portion of the input (to see how this may be done, it helps to assume the existence of a second tape). We see that L_cs is in NSPACE(n) and thus in PSPACE.

Theorem 13.11  L_cs, the CSL recognition problem, is PSPACE-complete.

Proof  We already know L_cs to be in PSPACE. Let L be an arbitrary member of PSPACE; say L is accepted by M, a DTM of space complexity p(n). Define L' to be {y$^{p(|y|)} | y is in L}, where $ is a new symbol. It is easy to check that L' is in DSPACE(n) and therefore is a CSL. Let G be a CSG for L', and let x be the binary encoding of G. Then the polynomial-time mapping that takes y to x#w, where w is the encoding of y$^{p(|y|)}, is a reduction of L to L_cs, showing L_cs is PSPACE-complete.
13.5 COMPLETE PROBLEMS FOR P AND NSPACE(log n)

It is obvious that DSPACE(log n) ⊆ P by Theorem 12.10. Could it be that P = DSPACE(log n), or perhaps P ⊆ DSPACE(log^k n) for some k? Similarly, it is obvious that DSPACE(log n) ⊆ NSPACE(log n). Could these two classes be equal? If so, then by a translation analogous to Lemma 12.2, it follows that NSPACE(n) ⊆ DSPACE(n); that is, deterministic and nondeterministic CSL's are the same.

We shall exhibit a language L_1 in P such that every language in P is log-space reducible to L_1. Should this language be in DSPACE(log^k n) for some k, then P is contained in DSPACE(log^k n). Similarly we exhibit an L_2 in NSPACE(log n) such that every language in NSPACE(log n) is log-space reducible to L_2. Should L_2 be in DSPACE(log n), then DSPACE(log n) would equal NSPACE(log n). There is, of course, no known way to recognize L_1 in log^k n space and no known way to recognize L_2 deterministically in log n space. Languages complete for NSPACE(log n) or for P are not necessarily hard to recognize, and in fact, the languages L_1 and L_2 are relatively easy. The results of this section serve merely to reinforce the idea that many complexity classes have complete problems. They do not suggest intractability the way NP-completeness or PSPACE-completeness results do.
Context-free emptiness

Define L_cfe to be the language of coded CFG's whose languages are empty. L_cfe is the language L_1 alluded to above. We shall show that every language in P is log-space reducible to L_cfe.

Theorem 13.12  L_cfe, the emptiness problem for CFG's, is complete for P with respect to log-space reductions.

Proof  We shall reduce an arbitrary language L in P to L_cfe using only log n space. Specifically, we shall design a log-space transducer M_1. Given input x of length n, M_1 writes a grammar G_x such that L(G_x) = ∅ if and only if x is in L. Let M be a p(n) time-bounded TM accepting the complement of L.

The nonterminals of G_x are S and symbols of the form A_{Xit}, where

1) X is a tape symbol of M, a pair [qY], where q is a state and Y a tape symbol, or the marker symbol # used to denote the ends of ID's;
2) 0 ≤ i ≤ p(n) + 1;
3) 0 ≤ t ≤ p(n).

The intention is that A_{Xit} derives some string w if and only if X is the ith symbol of the ID of M at time t. The symbol S is also a nonterminal of G_x; it is the start symbol.

The productions of G_x are:

1) S → A_{[q_f Y]it} for all i, t, and Y, where q_f is a final state.
2) Let f(X, Y, Z) be the symbol in position i of the tth ID whenever X, Y, and Z occupy positions i - 1, i, and i + 1 of the (t - 1)st ID. Since M is deterministic, f(X, Y, Z) is a unique symbol and is independent of i and t. Thus for each i and t, 1 ≤ i, t ≤ p(n), and for each triple X, Y, Z with W = f(X, Y, Z), we have the production

        A_{Wit} → A_{X,i-1,t-1} A_{Y,i,t-1} A_{Z,i+1,t-1}.

3) A_{#,0,t} → ε and A_{#,p(n)+1,t} → ε for all t.
4) A_{X,i,0} → ε for 1 ≤ i ≤ p(n) if and only if the ith symbol of the initial ID of M with input x is X.

An easy induction on t shows that for 1 ≤ i ≤ p(n), A_{Wit} derives ε if and only if W is the ith symbol of the ID at time t. Of course, no terminal string but ε is ever derived from any nonterminal.

Basis  The basis, t = 0, is immediate from rule (4).

Induction  If A_{Wit} derives ε, then by rule (2) it must be that for some X, Y, and Z, W = f(X, Y, Z) and each of A_{X,i-1,t-1}, A_{Y,i,t-1}, and A_{Z,i+1,t-1} derives ε. By the inductive hypothesis, the symbols in the ID at time t - 1 in positions i - 1, i, and i + 1 are X, Y, and Z, so W is the symbol at position i and time t by the definition of f. Conversely, if W = f(X, Y, Z), where X, Y, and Z are the symbols at time t - 1 in positions i - 1, i, and i + 1, then by the inductive hypothesis, or by rule (3) if i = 0 or i = p(n) + 1,

    A_{X,i-1,t-1} A_{Y,i,t-1} A_{Z,i+1,t-1}

each derive ε. Thus by rule (2), A_{Wit} derives ε.

Then by rule (1), S derives ε if and only if M accepts x. Finally we need show that the productions of G_x can be produced by M_1 in log n space with input x of length n. First of all, recall that log_2 p(n) ≤ c log_2 n for some constant c, since p(n) is a polynomial. Therefore M_1 can count from 0 to p(n) in log n scratch storage. Similarly M_1 can count from i = 0 to p(n) + 1 in log n space; the productions of G_x are easily generated by a double loop on i and t. Thus G_x is in L_cfe if and only if M does not accept x, and hence if and only if x is in L. The language L_cfe is complete for P with respect to log-space reductions.
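For context, membership of L_cfe in P follows from the standard marking algorithm for generating nonterminals; a sketch of ours, not from the text:

```python
def cfg_is_empty(productions, start):
    # productions: list of (head, body) pairs, with body a tuple of symbols.
    # A symbol that never appears as a head is treated as a terminal.
    heads = {h for h, _ in productions}
    generating = set()     # nonterminals known to derive a terminal string
    changed = True
    while changed:
        changed = False
        for h, body in productions:
            if h not in generating and all(s not in heads or s in generating for s in body):
                generating.add(h)
                changed = True
    return start not in generating

# S -> A B | C; A -> 'a'; B -> 'b'; C -> C C (C generates nothing)
prods = [('S', ('A', 'B')), ('S', ('C',)), ('A', ('a',)),
         ('B', ('b',)), ('C', ('C', 'C'))]
print(cfg_is_empty(prods, 'S'))              # prints False: S derives "ab"
print(cfg_is_empty([('S', ('S',))], 'S'))    # prints True
```

Each pass either marks a new nonterminal or halts, so the number of passes is at most the number of nonterminals, giving a polynomial-time bound.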
The graph reachability problem

Now we shall give a problem that is complete for NSPACE(log n) with respect to log-space reductions. The graph reachability problem is: given a directed graph with vertices {1, 2, ..., n}, determine if there is a path from 1 to n.

Theorem 13.13  The graph reachability problem is complete for NSPACE(log n) with respect to log-space reductions.

Proof  The formalization of this problem as a language is left to the reader. First we show that the graph reachability problem is in NSPACE(log n). A nondeterministic TM M can guess the path vertex by vertex. M does not store the path, but instead verifies the path, storing only the vertex currently reached.

Now, given a language L in NSPACE(log n), we reduce it in log n space deterministically to the language of encoded digraphs for which a path from the first vertex to the last exists. Let M be a log n space-bounded nondeterministic offline TM accepting L. An ID of M can be represented by the storage tape contents, which take log n space to represent, the storage tape head position and state, which may be coded with the storage contents via a composite symbol [qX], and the input head position, which requires log n bits.

We construct a log-space transducer M_1 that takes input x and produces a digraph G_x with a path from the first to the last vertex if and only if M accepts x. The vertices of G_x are the ID's of M with input x (but with the input head position, rather than with x itself), plus a special vertex, the last one, which represents acceptance. The first vertex is the initial ID of M with input x. M_1 uses its log n storage to cycle through all the ID's of M. For each ID I, M_1 positions its input head at the correct input position, so it can see the input symbol scanned by M. M_1 then generates arcs I → J for all the finite number of J's such that I can become J by one move of M. Since M_1 has I available on its storage tape, and J can be easily constructed from I, this generation requires no more than log n space. If I is an accepting ID, M_1 generates the arc I → v, where v is the special vertex. It is straightforward to check that there is a path in G_x from the initial ID to v if and only if M accepts x. Thus each language in NSPACE(log n) is log-space reducible to the reachability problem. We conclude that the reachability problem is complete for NSPACE(log n) with respect to log-space reductions.
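The nondeterministic machine stores only the current vertex; deterministically we can mimic its behavior by iterating the one-arc step n times, since a path from 1 to n, if one exists, has length at most n (a sketch of ours; the set below of course uses more than log space):

```python
def path_exists(n, arcs):
    # Vertices are 1..n; arcs is a set of (v, w) pairs.
    # After k iterations, `reachable` holds every vertex reachable from 1
    # by a path of at most k arcs.
    reachable = {1}
    for _ in range(n):
        reachable |= {w for (v, w) in arcs if v in reachable}
    return n in reachable

print(path_exists(4, {(1, 2), (2, 3), (3, 4)}))    # prints True
print(path_exists(4, {(1, 2), (4, 3)}))            # prints False
```

Savitch's theorem, by contrast, solves reachability deterministically in O(log^2 n) space by halving the path length recursively rather than storing a reachable set.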
13.6 SOME PROVABLY INTRACTABLE PROBLEMS

Up to now we have strongly implied that certain problems require exponential time, by proving them NP-complete or PSPACE-complete. We shall now prove that two problems actually require exponential time. In one case, we reduce to our problem a language which, by the space hierarchy theorem, is known to require exponential space and hence exponential time. In the second case, we show how to reduce to our problem all languages in nondeterministic exponential time, and then argue by a nondeterministic time hierarchy theorem [Cook 1973a] that among them there must be one that really requires, say, 2^n time.

We shall now consider a problem about regular expressions that is somewhat contrived, so that (a) at least 2^{cn/log n} space is required to solve it, and (b) this requirement can be readily proved. After that, we consider a problem in logic that is not contrived, in that it had been considered long before its complexity was analyzed, and where the proof of exponentiality is far from straightforward.
Regular expressions with exponentiation

Let us consider regular expressions over an alphabet assumed for convenience not to contain the symbols ↑, 0, or 1. Let r↑i stand for the regular expression rr...r (i times), where i is written in binary. For example, the expression (a↑11 + b↑11)↑10 stands for {aaaaaa, aaabbb, bbbaaa, bbbbbb}. We shall assume ↑ has higher precedence than the other operators. The problem we show requires essentially exponential space, that is, 2^{p(n)} space for some polynomial p(n), is whether a regular expression with exponentiation denotes all strings over its alphabet (remember that ↑, 0, and 1 are used as operators and are not part of the alphabet). First we give an exponential-space algorithm for the problem.

Theorem 13.14  The problem whether a regular expression with exponentiation denotes all strings over its alphabet can be solved in exponential space.

Proof  Given a regular expression of length n, we shall expand the ↑'s to obtain an ordinary regular expression and show that it has length at most n2^n. Then we shall convert this expression to an NFA of at most n2^{n+2} states and test whether that NFA accepts Σ*. (Note that this latter step must be done without conversion to a DFA, since the DFA might have 2^{n2^{n+2}} states.) To eliminate the ↑'s we work from the inside out. We prove by induction on j that an expression of length m, with j 0's and 1's, has an equivalent ordinary regular expression of length at most m2^j.

Basis  j = 0. The result is immediate.

Induction  Scan the expression r of length m from the left until the first ↑ is encountered. Then scan back until the left argument r_1 of that ↑ is found. We assume ↑ has highest precedence, so its argument must be a single symbol or be surrounded by parentheses; hence this extraction is easy. Let the expression be r = r_2 r_1↑i r_3. Replace r by r' = r_2 r_1 r_1 ... r_1 r_3, where r_1 is written i times. By the inductive hypothesis, r' has an equivalent ordinary regular expression of length at most (m + (i - 1)|r_1|)2^{j - log_2 i} symbols. Since 2^{j - log_2 i} = 2^j/i, and since |r_1| ≤ m, we see that

    (m + (i - 1)|r_1|)2^{j - log_2 i} = (m + (i - 1)|r_1|)2^j/i ≤ m2^j.

If r is of length n, then surely j ≤ n, so the equivalent ordinary regular expression has length at most n2^n.

Now, using the algorithm of Theorem 2.3, we can produce an equivalent NFA of at most 4n2^n = n2^{n+2} states. Nondeterministically guess, symbol by symbol, an input a_1 a_2 ... that the NFA does not accept. Using n2^{n+2} cells we can, after each guess, compute the set of states entered after the symbols guessed so far. The input need not be written down, since we can compute the set of states entered from this set on any input symbol. If we ever guess an input sequence on which no accepting state of the NFA is entered, we accept; the original expression does not denote Σ*. By Savitch's theorem we may perform this process deterministically using space n^2 4^n. It is easy to devise an encoding of the NFA that can be stored in O(n^2 2^n) cells, since about n bits suffice to code a state, and the input alphabet is no larger than n. As n^2 4^n ≤ 2^{3n} for all sufficiently large n, the space used is exponential in n, as required.

We shall now provide a lower bound of 2^{cn/log n}, for some constant c, on the space required for the above problem. Observe that proving a certain amount of space is required also proves that the same amount of time is required (although the opposite is not true).
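The inside-out elimination of ↑ used in the proof of Theorem 13.14 can be sketched as follows (our illustration; for simplicity, alphabet symbols are single characters, and we check the example expression with Python's re module after translating + to |):

```python
import re

def expand(expr):
    # Repeatedly rewrite the leftmost r↑i (i in binary) as i copies of r.
    # The leftmost ↑'s argument contains no ↑, so this works inside out.
    while '↑' in expr:
        k = expr.index('↑')
        if expr[k - 1] == ')':                 # argument is a parenthesized group
            depth, j = 0, k - 1
            while True:
                if expr[j] == ')':
                    depth += 1
                elif expr[j] == '(':
                    depth -= 1
                    if depth == 0:
                        break
                j -= 1
            start, arg = j, expr[j:k]
        else:                                  # argument is a single symbol
            start, arg = k - 1, expr[k - 1]
        digits = re.match(r'[01]+', expr[k + 1:]).group()
        i = int(digits, 2)                     # the exponent, read in binary
        expr = expr[:start] + arg * i + expr[k + 1 + len(digits):]
    return expr

ordinary = expand('(a↑11+b↑11)↑10')            # '(aaa+bbb)(aaa+bbb)'
pattern = ordinary.replace('+', '|')
print(sorted(w for w in ['aaaaaa', 'aaabbb', 'bbbaaa', 'bbbbbb', 'aabbba']
             if re.fullmatch(pattern, w)))
# prints ['aaaaaa', 'aaabbb', 'bbbaaa', 'bbbbbb'], the set from the text
```

The expansion can square the length at each binary digit eliminated, which is exactly the m2^j bound in the proof.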
Theorem 13.15  There is a constant c > 0 such that every TM accepting the language L_rex of regular expressions with exponentiation that denote Σ* takes more than 2^{cn/log n} space (and therefore 2^{cn/log n} time) infinitely often.
Proof  Consider an arbitrary 2^n space-bounded single-tape deterministic TM M. For each input x of length n, we construct a regular expression with exponentiation E_x that denotes Σ*, where Σ is the alphabet of E_x, if and only if M does not accept x. We do so by making E_x denote all invalid computations of M on x. Let Σ consist of all tape symbols of M, the composite symbols [qX], where q is a state and X a tape symbol, and the marker symbol #. Assume that ↑, 0, and 1 are none of these symbols.

A string y in Σ* is not an accepting computation of M on x if and only if one or more of the following are true:
1) The initial ID is wrong.
2) There is no accepting state.
3) One ID does not follow from the previous one by a move of M.

In what follows, we use sets of symbols to represent the regular expression that is the sum of those symbols. Thus, if Σ = {a_1, a_2, ..., a_n}, then we use Σ as a shorthand for the regular expression a_1 + a_2 + ... + a_n. Similarly we use Σ - a to stand for the regular expression that is the sum of all the symbols in Σ except a.

The regular expression denoting all strings that do not begin with the initial ID is given by

    START = ε + (Σ - #)Σ* + A_1 + A_2 + ... + A_n
          + Σ↑(n + 1)(Σ + ε)↑(2^n - n - 1)(Σ - B)Σ*
          + Σ↑(2^n + 1)(Σ - #)Σ*,

where A_1 = Σ↑1(Σ - [q_0 a_1])Σ*, and A_t = Σ↑t(Σ - a_t)Σ* for 2 ≤ t ≤ n.

The next-to-last term denotes Σ↑(n + 1) followed by up to 2^n - n - 1 symbols followed by anything but a blank, and so denotes strings such that some position between n + 1 and 2^n of the first ID does not contain a blank. Since n and 2^n - n - 1 are written in binary, the length of this term is proportional to n. The last term denotes strings in which the symbol after the first 2^n + 1 symbols, which should be the # ending the first ID, is not #. It is also of length proportional to n. The remaining terms are of length proportional to log n, and there are n + 3 such terms. Thus the length of START is proportional to n log n. Curiously, the length of the expression denoting false initial ID's dominates the length of the other terms in E_x.

A regular expression enforcing the condition that there is no accepting state is given by

    FINISH = (Σ - {[qX] | q is a final state})*.

This expression is of constant length, depending only on M.
Finally, let f(X, Y, Z) denote the symbol in position i of the next ID whenever X, Y, and Z occupy positions i - 1, i, and i + 1 of one ID. That is, if W, X, and Y occupy three consecutive positions of one ID, then the correct symbol 2^n positions to the right of Y, in the next ID, is f(W, X, Y). Let MOVE be the sum, over the finite number of triples (W, X, Y) of symbols in Σ, of the terms

    Σ* W X Y (Σ↑(2^n - 1))(Σ - f(W, X, Y))Σ*.

That is, MOVE denotes those strings with W, X, and Y occupying consecutive positions that have a wrong symbol 2^n positions to the right of Y. As the length of each term is linear in n, the length of MOVE is linear in n.

The desired expression is E_x = START + FINISH + MOVE. If M accepts x, then the accepting computation is not in E_x, so E_x ≠ Σ*. Conversely, if some string y is not in E_x, then y must begin #[q_0 a_1]a_2 ... a_n B^{2^n - n}#, each ID must follow the previous one by one move of M, and acceptance must occur somewhere along the way; thus M accepts x. Therefore E_x = Σ* if and only if M does not accept x.

Now, let M be a Turing machine accepting a language L that can be accepted in 2^n space but not in 2^n/n space. The hierarchy theorem for space assures us that such an L exists. Suppose there were an S(n) space-bounded TM accepting the set L_rex of regular expressions with exponentiation denoting Σ*, suitably coded so that L_rex has a finite alphabet. Then we could recognize L as follows.

1) From x of length n, construct E_x in an obvious way, in space proportional to n log n.
2) Code E_x into the alphabet of L_rex. As M has a finite number of symbols, the length of the coded E_x is at most cn log n for some constant c.
3) In S(cn log n) space, determine whether the coded E_x is in L_rex. If so, reject x; if not, accept x.

The total amount of space used is the maximum of n log n and S(cn log n). As no TM using less than 2^n/n space can accept L, it must be that

    n log n + S(cn log n) ≥ 2^n/n  infinitely often,    (13.4)

else L could be accepted in 2^n/n space by Lemma 12.3. There exists a constant d > 0 such that if S(m) were less than 2^{dm/log m} for all but a finite set of m, then (13.4) would be false. It follows that S(m) > 2^{dm/log m} for some constant d and an infinite number of m's.
Corollary  L_rex is complete for exponential space with respect to polynomial-time reductions.

Proof  In Theorem 13.15, we gave a polynomial-time reduction to L_rex that works for every language L in DSPACE(2^n). We could easily have generalized it to reduce any language in DSPACE(2^{p(n)}), for polynomial p, to L_rex.

We should observe that the n log n bound on the length of E_x is critical for Theorem 13.15, although for its corollary we could have allowed the length to be any polynomial in |x|. If, for example, we could only prove that |E_x| ≤ |x|^2, then our lower bound on the space required by L_rex would have been 2^{d√n} instead.

Complexity of first-order theories
Now we shall consider a problem that requires at least 2^{cn} time nondeterministically, and is known to be solvable in exponential space and doubly exponential time. As the problem can also be shown nondeterministic exponential-time-hard with respect to polynomial-time reductions, proving a better lower bound on the amount of nondeterministic time would improve on Theorem 12.10, which is most unlikely.

A first-order language consists of a domain (for example, the nonnegative integers), a set of operations (for example, +, *), a set of predicates (for example, =, <), a set of constants chosen from the domain, and a set of axioms defining the meaning of the operators and predicates. For each theory we can define the language of true expressions over the constants, operators, predicates, variables, the logical connectives ∧, ∨, and ¬, and the quantifiers ∃ and ∀.

Example 13.5  (N, +, *, =, <, 0, 1), where N stands for the nonnegative integers, is known as number theory. Gödel's famous incompleteness theorem states that the language of true statements in number theory is undecidable. While Gödel's result predated Turing machines, it is not hard to show his result. If a TM M accepts when started on blank tape, it does so by a computation in which no ID is longer than some constant m. We may treat an integer i, in binary, as a computation of M with ID's of length m. The statement that M accepts, which is known to be undecidable, can be expressed as ∃i∃m(E_m(i)), where E_m is a predicate that is true if and only if i is the binary encoding of a computation leading to acceptance of M, with no ID longer than m. (Some of the details are provided in Exercise 13.37.) Thus number theory is an undecidable theory.

There are a number of decidable theories known. For example, (R, +, =, <, 0, 1), the theory of reals with addition, is decidable, and we shall show that it inherently requires nondeterministic exponential time. If the reals are replaced by the rationals, we get the same true statements, since without multiplication it is impossible to find a statement, like ∃x(x * x = 2), that is true for the reals but not the rationals. The theory of integers with addition (Z, +, =, <, 0, 1), called Presburger arithmetic, is decidable, and is known to require doubly exponential nondeterministic time. That is, 2^{2^{cn}} is a lower bound on the nondeterministic time complexity of Presburger arithmetic.
Example 13.6  Before proceeding, let us consider a number of examples in the theory of reals with addition. ∀x∃y(y = x + 1) is true: it says that x + 1 is a real whenever x is.

    ∀x∀y[x = y ∨ ∃z(x < z ∧ z < y) ∨ ∃z(y < z ∧ z < x)]

is also true: it states that between two different reals we can find a third real; that is, the reals are dense. The statement ∃y∀x(x < y ∨ x = y) is false, since for every real number y there is a greater real. Note that we have not told how to decide whether a statement is true; the decision depends on knowing the properties of real numbers, with which we assume the reader is familiar.
A decision procedure for the reals with addition

We shall begin our study of the reals with addition by giving a decision procedure that requires exponential space and doubly exponential time. To begin, let us put our given statement in prenex normal form, where all quantifiers apply to the whole expression. It is easy to obtain an expression in this form if we first rename quantified variables so they are unique, and then apply the identities

    ¬(∀x(E)) = ∃x(¬E)
    ∀x(E_1) ∨ E_2 = ∀x(E_1 ∨ E_2)

and four similar rules obtained from these by interchanging ∀ and ∃ and/or replacing ∨ by ∧. This process does not more than double the length of the expression; the only symbols that might be added to the expression are a pair of parentheses per quantifier.†
Ql*lQ 2 *2
Gm*m^(*l,
*2>
(13.5)
*m)>
where the Q/s are quantifiers, and the formula F has no quantifiers. F is therefore a Boolean expression whose operands are atoms, an atom being a Boolean constant or an expression of the form E 1 op E 2 where op is = or < and E and E 2 are sums of variables and the constants 0 and 1. We know F is of this form because no other combination of operators make sense. That is, + can be applied only to variables and constants, < and = relate only arithmetic expressions, and the Boolean operators can be applied sensibly only to expressions that have true/false ,
y
as possible values.
To determine the truth or falsehood of (13.5) we repeatedly substitute for the innermost quantifier a bounded quantification, which is the logical "or" (in place of ∃) or "and" (for ∀) of a large but finite number of terms. Suppose in (13.5) we fix the values of x₁, x₂, ..., x_{m−1}. Every atom involving x_m can be put in the form x_m op t, where op is <, =, or >, and t is of the form

t = c₀ + Σ_{i=1}^{m−1} c_i x_i,

where the c_i's are rationals. Suppose all these atoms are x_m op t_i, 1 ≤ i ≤ k, for the given values of x₁, ..., x_{m−1}. If t₁ < t₂ < ⋯ < t_k, then the truth of F depends only on which interval of Fig. 13.6 the value of x_m lies in, or on whether x_m equals one of the t_i.

[Fig. 13.6 Representative values of x_m.]

As the values of x₁, ..., x_{m−1} vary, we do not really know the order of the t_i's. However, trying x_m = t_i for each i, x_m = ½(t_i + t_j) for each i and j, and x_m = ±∞,† we are sure, no matter what the order of the t_i's, to have a representative x_m in each interval of Fig. 13.6 and also at the t_i's themselves, where atoms with the = operator may become true. It follows that if Q_m = ∃, then ∃x_m F(x₁, ..., x_m) may be replaced by

F(x₁, ..., x_{m−1}) = ∨ F(x₁, ..., x_m),    (13.6)

where x_m = t_i, or x_m = ½(t_i + t_j), or x_m = ±∞; that is, by the logical "or" of k(k + 1)/2 + 2 terms, each of which is F with a substitution for x_m. If Q_m = ∀, a similar replacement, with ∧ substituting for ∨, may be made. If F has k atoms, then after the substitution F has k[k(k + 1)/2 + 2] atoms, which is at most k³ atoms for k ≥ 3. Also, if the coefficients in the atoms of F are each the ratio of integers of at most r bits each, then after grouping terms, solving for x_m, and computing the average of two t_i's, we find that the coefficients in the atoms of F will be ratios of integers with no more than 4r + 1 bits. This follows since, if a, b, c, and d are r-bit integers,

a/b ± c/d = (ad ± bc)/bd,

† Technically, the renaming of variables may increase the length of the formula by a log n factor when we encode in a fixed alphabet. However, the complexity depends on the original number of symbols and not the length of the encoded string.

† If x_m = +∞, then atoms x_m < t and x_m = t are false, and x_m > t is true, independently of t. If x_m = −∞, analogous simplifications occur.
so a/b ± c/d is the ratio of a (2r + 1)-bit integer, ad ± bc, and a 2r-bit integer, bd. For r ≥ 1, then, the coefficients require no more than five times the length of the coefficients in F.

If we repeat the above process to eliminate all the quantifiers and variables, we eventually produce a formula with only logical operators, =, <, and constants. The constants are ratios of integers with at most 5^m r bits. The number of atoms is at most

(⋯((k³)³)⋯)³ = k^(3^m),

the cubing being done m times. As each atom is a relation between constants of 5^m r bits, and k, m, and r are less than n, the length of the expression is at most 2^(2^(cn)) for some constant c (note that 3^n < 2^(2n)). We may evaluate an atom of the form a/b < c/d by computing ad − bc and comparing it with 0. Thus the entire final expression may be evaluated in the square of its length. Hence our decision procedure takes 2^(2^(dn)) time for some constant d.
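The test-point substitution can be made concrete for the simplest case. The Python sketch below is our own illustration (all names are ours, not the book's), deciding Qx F(x) for a quantifier-free F linear in a single variable: it tries the root of each atom, the midpoints of pairs of roots, and ±∞, exactly the representative values of Fig. 13.6. The book's full procedure substitutes symbolically with many variables, which this sketch does not attempt.

```python
from fractions import Fraction

# An atom is (a, c, op), read as:  a*x + c  op  0, with op in {"<", "=", ">"}.
# A formula is ("atom", index) or ("not", f) / ("and", f, ...) / ("or", f, ...).

POS_INF, NEG_INF = "+inf", "-inf"

def atom_value(a, c, op, x):
    """Truth of a*x + c op 0 at a test point x (possibly +-infinity)."""
    if x in (POS_INF, NEG_INF):
        # at infinity the sign of a*x dominates unless a == 0
        lhs = c if a == 0 else (a if x == POS_INF else -a)
        if op == "<": return lhs < 0
        if op == ">": return lhs > 0
        return a == 0 and lhs == 0     # equality needs a constant atom
    lhs = a * x + c
    return {"<": lhs < 0, "=": lhs == 0, ">": lhs > 0}[op]

def eval_formula(f, atoms, x):
    kind = f[0]
    if kind == "atom":
        a, c, op = atoms[f[1]]
        return atom_value(a, c, op, x)
    if kind == "not": return not eval_formula(f[1], atoms, x)
    if kind == "and": return all(eval_formula(g, atoms, x) for g in f[1:])
    if kind == "or":  return any(eval_formula(g, atoms, x) for g in f[1:])
    raise ValueError(kind)

def eliminate(quantifier, f, atoms):
    """Decide 'exists x F(x)' or 'forall x F(x)' by test-point substitution."""
    roots = [Fraction(-c, a) for (a, c, op) in atoms if a != 0]   # the t_i
    tests = roots + [(r1 + r2) / 2 for r1 in roots for r2 in roots]
    tests += [POS_INF, NEG_INF]
    results = (eval_formula(f, atoms, x) for x in tests)
    return all(results) if quantifier == "forall" else any(results)
```

For example, ∃x(x > 0 ∧ x < 1) is confirmed true by the midpoint ½ of the two roots, and ∀x(x < 1) is refuted at the root 1 itself.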
The procedure as we have given it also takes doubly exponential space. However, we can reduce the space to a single exponential by evaluating F recursively. We have already seen that we need consider only a finite set of values for each x_i. The values for x_i are given by a formula of the form a₀ + Σ_{j=1}^{i−1} a_j x_j, where the a_j's are rationals that are ratios of 5^(m−i+1) r-bit integers, and r is the number of bits in the largest constant of the original formula F; note r ≤ log n. Thus values for x₁ are rationals that are at most ratios of 5^m r-bit integers, the values for x₂ are ratios of at most 5^(m+1) r-bit integers, and so on. Thus we need only cycle through values for each x_i that are at most 5^(2m) r bits.

We use a recursive procedure EVAL(G) that determines whether G is true when the variables take on the values ±∞ and any ratio of 5^(2m) r-bit integers. If G has no quantifiers, then it consists only of arithmetic and logical relations among rationals, so its truth can be determined directly. If G = ∀x(G′), EVAL(G) calls EVAL(G″) for all G″ formed from G′ by replacing x by ±∞ or a ratio of 5^(2m) r-bit integers. EVAL(G) is true if EVAL(G″) returns true for all these expressions G″. If G = ∃x(G′), we do the same, but EVAL(G) returns true whenever some EVAL(G″) is true.

It is easy to check that no more than m copies of EVAL are active simultaneously. The arguments for the active calls to EVAL can be put on a stack, and this stack takes O(m 5^(2m) r) space. Thus, if F is an expression of length n, we may evaluate F in space 2^(cn) and time 2^(2^(dn)) for some constants c and d.

A lower bound

We now show that the theory of reals with addition requires essentially nondeterministic exponential time. A series of lemmas is needed showing that multiplication and exponentiation by limited-size integers can be expressed by short formulas.
Lemma 13.10 There exists c > 0 such that for each n there is a formula M_n(x, y, z) that is true if and only if x is a nonnegative integer strictly less than 2^(2^n) and xy = z. Furthermore, |M_n(x, y, z)| ≤ c(n + 1), and M_n(x, y, z) can be constructed from n in time polynomial in n.

Proof For n = 0 we have 2^(2^0) = 2. Thus M₀(x, y, z) can be expressed as

(x = 0 ∧ z = 0) ∨ (x = 1 ∧ z = y).

Inductive step (construction of M_{k+1} from M_k): Let x be an integer less than 2^(2^(k+1)). There exist integers x₁, x₂, x₃, x₄ < 2^(2^k) such that x = x₁x₂ + x₃ + x₄. In proof, let x₁ = x₂ = ⌊√x⌋. Now z = xy can be expressed by z = x₁(x₂y) + x₃y + x₄y. Thus

M_{k+1}(x, y, z) = ∃u₁ ⋯ ∃u₅ ∃x₁ ⋯ ∃x₄ [M_k(x₁, x₂, u₁) ∧ M_k(x₂, y, u₂) ∧ M_k(x₁, u₂, u₃) ∧ M_k(x₃, y, u₄) ∧ M_k(x₄, y, u₅) ∧ x = u₁ + x₃ + x₄ ∧ z = u₃ + u₄ + u₅].    (13.7)

That is, u₁ = x₁x₂, u₂ = x₂y, u₃ = x₁x₂y, u₄ = x₃y, and u₅ = x₄y, so that x = x₁x₂ + x₃ + x₄ and z = x₁x₂y + x₃y + x₄y. The condition that each x_i is an integer less than 2^(2^k) is enforced by each x_i being the first argument of some M_k.

Formula (13.7) has five copies of M_k, so it appears that M_{k+1} must be at least five times as long as M_k. This would make the length of M_n exponential in n, not linear as we asserted. However, we can use the "trick" of Theorem 13.10 to replace several copies of one predicate by a single copy. That is, we may write

M_{k+1}(x, y, z) = ∃u₁ ⋯ ∃u₅ ∃x₁ ⋯ ∃x₄ [x = u₁ + x₃ + x₄ ∧ z = u₃ + u₄ + u₅
    ∧ ∀r ∀s ∀t [¬(r = x₁ ∧ s = x₂ ∧ t = u₁)
    ∧ ¬(r = x₂ ∧ s = y ∧ t = u₂)
    ∧ ¬(r = x₁ ∧ s = u₂ ∧ t = u₃)
    ∧ ¬(r = x₃ ∧ s = y ∧ t = u₄)
    ∧ ¬(r = x₄ ∧ s = y ∧ t = u₅)
    ∨ M_k(r, s, t)]],

which has only a constant number of symbols more than M_k does.
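The decomposition behind the inductive step is easy to check numerically. The sketch below is our own illustration (the helper name is ours): it splits x exactly as the proof suggests, with x₁ = x₂ = ⌊√x⌋ and the remainder divided between x₃ and x₄, and confirms that all four parts stay below 2^(2^k) whenever x < 2^(2^(k+1)).

```python
from math import isqrt

def decompose(x):
    """Write x = x1*x2 + x3 + x4 with x1 = x2 = floor(sqrt(x)).
    The remainder x - x1*x2 is at most 2*x1, so it splits into two
    pieces each at most x1."""
    s = isqrt(x)
    rem = x - s * s          # 0 <= rem <= 2s
    x3 = min(rem, s)
    x4 = rem - x3
    return s, s, x3, x4

k = 3
bound = 2 ** (2 ** k)                 # the parts must be below 2^(2^k)
for x in [0, 1, 17, bound - 1, bound * bound - 1]:   # x < 2^(2^(k+1))
    x1, x2, x3, x4 = decompose(x)
    assert x == x1 * x2 + x3 + x4
    assert all(p < bound for p in (x1, x2, x3, x4))
    # multiplication then factors as in the lemma:
    y = 123456789
    assert x * y == x1 * (x2 * y) + x3 * y + x4 * y
```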
One minor point is that if we introduce new variable names in each M_k, we shall eventually introduce a log n factor into the coded length of M_n, since variable names must be coded in a fixed alphabet in the language of true formulas. However, the scope rules for quantified formulas allow us to reuse variables, subject to the restriction that the twelve new variables introduced in M_{k+1} don't conflict with the free variables x, y, and z. Thus M_k requires only 15 different variables, and for each M_n the coded length is proportional to the number of symbols.

Observe that M_n(x, 0, 0) states that x is an integer less than 2^(2^n). Thus we can make statements about small integers in the theory of reals with addition by using very short formulas.
Lemma 13.11 There exists a constant c > 0 such that for every n there is a formula P_n(x, y, z) that is true if and only if x and z are integers in the range 0 ≤ x, z < 2^(2^n) and y^x = z. Furthermore, |P_n| ≤ c(n + 1), and P_n can be constructed from n in time polynomial in n.

Proof We construct by induction on k a sequence of formulas E_k(x, y, z, u, v, w) such that E_k is true if and only if 0 ≤ x, z < 2^(2^k), z = y^x, u is an integer with 0 ≤ u < 2^(2^k), and uv = w. The reason for doing this is that we wish to express E_{k+1} in terms of several copies of E_k; we could not do this with P_k, since a formula for P_k involves both P_{k−1} and M_{k−1}, while E_k has both exponentiation and multiplication built into it. We then use universal quantification to express E_{k+1} in terms of one copy of E_k.

Basis For k = 0,

E₀(x, y, z, u, v, w) = [(x = 0 ∧ z = 1) ∨ (x = 1 ∧ z = y)] ∧ M₀(u, v, w).

Induction To construct E_{k+1}(x, y, z, u, v, w), we use the fact that, as in Lemma 13.10, we can assert that there exist integers x₁, x₂, x₃, x₄ in the range 0 ≤ x_i < 2^(2^k) such that

x = x₁x₂ + x₃ + x₄   and   y^x = ((y^(x₁))^(x₂)) · y^(x₃) · y^(x₄).

Using several copies of E_k we can express these conditions, the exponentiations, and the multiplications involved. Finally, we use the "trick" of Theorem 13.10 to express E_{k+1} in terms of one copy of E_k with a constant number of additional symbols. Last, we may write

P_n(x, y, z) = E_n(x, y, z, 0, 0, 0).

This asserts that z = y^x and that x and z are integers in the range 0 ≤ x, z < 2^(2^n).

To improve readability of what follows, we may use the abbreviations 2 for 1 + 1, x ≤ y for (x = y ∨ x < y), and x ≤ y ≤ z for (x ≤ y) ∧ (y ≤ z).
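The exponent identity that drives the induction for E_{k+1} can be checked directly. This is our own numerical illustration (the function name is ours), reusing the same ⌊√x⌋ split as in Lemma 13.10:

```python
from math import isqrt

def exp_identity_holds(y, x):
    """The identity behind the inductive step: if x = x1*x2 + x3 + x4,
    then y**x == ((y**x1)**x2) * y**x3 * y**x4."""
    x1 = x2 = isqrt(x)
    rem = x - x1 * x2
    x3 = min(rem, x1)
    x4 = rem - x3
    return y ** x == ((y ** x1) ** x2) * (y ** x3) * (y ** x4)

# holds for every base and exponent, since exponents add and multiply this way
assert all(exp_identity_holds(y, x) for y in (2, 3, 10) for x in range(60))
```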
Expanding an abbreviated formula results in at most multiplying the length by a constant factor. In addition to the above abbreviations, we shall use constants like 2^n and multiplications like ab in formulas. Technically these must be replaced by introducing an existentially quantified variable, say x, and asserting x = 2^n or x = ab by P_n(n, 2, x) or M_n(a, b, x). This can also increase the length of the formula by a constant factor.

We intend to encode Turing machine computations as integers. Let M be a 2^n time-bounded NTM. If the total number of tape symbols, composite symbols, and the marker # is b, then a computation of M is an integer x in the range 0 ≤ x < b^((2^n + 1)² + 1). Asserting that an integer is a computation is facilitated by a predicate that interrogates the ith digit in the b-ary representation of x.

Lemma 13.12 For each n and b there exists a constant c, depending only on b, and a formula D_{n,b}(x, i, j) that is true if and only if x and i are integers, 0 ≤ x < b^((2^n + 1)² + 1), 0 ≤ i < 2^(2n), and the (i + 1)th digit of x, counting from the low-order end of the b-ary representation of x, is j. Furthermore, |D_{n,b}| ≤ c(n + 1), and D_{n,b} can be constructed from n and b in time polynomial in n.

Proof For each b there exists a constant s such that b^((2^n + 1)² + 1) ≤ 2^(2^(sn)) for all n. Thus, that x is an integer in the correct range can be expressed by

∃m[P_{sn}((2^n + 1)² + 1, b, m) ∧ 0 ≤ x < m].

(Recall our previous remarks concerning constants like 2^n and their expansions.) That i is an integer in the range 0 ≤ i < 2^(2n) can be expressed by M_n(i, 0, 0) ∧ (0 ≤ i + 1 ≤ 2^(2n)). Now x in base b has j as its (i + 1)th digit if and only if there exist integers q and r such that x = qb^(i+1) + r and jb^i ≤ r < (j + 1)b^i. This fact is easily expressed using P_{sn} and M_{sn}.
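The arithmetic fact underlying D_{n,b} is worth checking on its own: dividing off b^(i+1) isolates the low-order digits, and the bracketing of the remainder pins down digit i + 1. A quick illustration of ours:

```python
def digit_matches(x, i, j, b):
    """True iff j is the (i+1)th b-ary digit of x, counting from the
    low-order end: there exist q, r with x = q*b**(i+1) + r and
    j*b**i <= r < (j+1)*b**i."""
    q, r = divmod(x, b ** (i + 1))
    return j * b ** i <= r < (j + 1) * b ** i

x, b = 123456, 10
for i, expected in enumerate("654321"):      # digits of x, low-order first
    assert digit_matches(x, i, int(expected), b)
    assert not digit_matches(x, i, (int(expected) + 1) % 10, b)
```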
Theorem 13.16 Any nondeterministic algorithm to decide whether a formula in the first-order theory of reals with addition is true must, for some constant c > 0, take 2^(cn) steps for an infinite number of n's.
Proof The proof is quite similar in spirit to that of Theorem 13.1. Let M be an arbitrary 2^n-time bounded NTM. Here ID's in a computation of M consist of 2^n symbols rather than p(n) as in Theorem 13.1. Let the total number of tape symbols, composite symbols, and #'s be b. Then a computation of M on an input of length n consists of [(2^n + 1)² + 1] b-ary digits. We may consider this computation to be an integer i in the range 0 ≤ i < 2^(2^(sn)) for some constant s. For convenience, we take the low-order digits of i to be at the left end of the computation. Let x be an input of length n to M. We construct a formula F_x that is true if and only if M accepts x. F_x is of the form ∃i(⋯), where the formula within the parentheses asserts that i is an accepting computation of M on x. This formula is analogous to that in Theorem 13.1. The first n + 1 symbols of the computation are

#[q₀a₁, m]a₂ ⋯ a_n,
assuming that x = a₁a₂ ⋯ a_n, q₀ is the initial state, and m is any choice of first move. To say that the first n + 1 symbols of the computation are correct, we say that there exist u and j such that the value of u represents #[q₀a₁, m]a₂ ⋯ a_n for some m, and i = b^(n+1) j + u for some integer j. We must write this formula with O(n) symbols, in time that is polynomial in n.

By induction on k = 2, 3, ..., n + 1 we can write a formula C_k(v), with free variable v, which asserts that the value of v is the numerical value of the first k symbols of the computation. For the basis, k = 2, we simply write a formula

C₂(v) = (v = p₁ ∨ v = p₂ ∨ ⋯ ∨ v = p_n),

where the p_j's are the integers represented by #[q₀a₁, m], for the finite set of values of m. For the induction,

C_k(v) = ∃w(C_{k−1}(w) ∧ v = bw + a_{k−1}),

where a_{k−1} is taken to be the numerical value of tape symbol a_{k−1}. To avoid using n variables to express C_{n+1}, which would make its length O(n log n), we alternate between two variables, such as v and w, as we construct C₂, C₃, .... The desired formula asserts C_{n+1}(u) and i = b^(n+1) j + u for some integer j. The latter assertion is similar to what was done in Lemma 13.12, and the technique will not be repeated here.
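Numerically, the C_k formulas compute nothing more than the base-b value of a symbol string, folding in one symbol per step via v = bw + a. A short sketch of ours (the symbol codes are made up for illustration):

```python
def encode(symbols, b):
    """Build the value exactly as the formulas C_k build it: start with
    the first symbol's value and fold in one more digit per step."""
    v = symbols[0]
    for a in symbols[1:]:
        v = b * v + a          # the step asserted by C_k: v = b*w + a
    return v

def decode(v, b, length):
    """Recover the symbols from the integer encoding."""
    out = []
    for _ in range(length):
        v, a = divmod(v, b)
        out.append(a)
    return out[::-1]

syms = [3, 0, 2, 2, 1]         # hypothetical codes for '#[q0 a1, m] a2 ...'
v = encode(syms, b=4)
assert decode(v, 4, len(syms)) == syms
```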
To express that the initial ID was correct in Theorem 13.1 required asserting that "approximately" p(n) cells contained the blank symbol. This was accomplished by the logical ∨ of p(n) items. We must now assert that about 2^n cells contain the blank symbol, and we cannot use a logical ∨ of 2^n formulas; this would be too long a formula. Instead we use the quantifier ∀j and assert that either j is not an integer in the range n + 2 ≤ j ≤ 2^n + 1, or the jth symbol of the initial ID is the blank, which we denote by 0. Thus we write

∀j[¬M_{sn}(j, 0, 0) ∨ ¬(n + 2 ≤ j ≤ 2^n + 1) ∨ D_{n,b}(i, j, 0)].

The formulas that force the last ID to contain a final state, and force each ID to follow from the previous ID because of the choice of move embedded in the previous ID, are similarly translated from the techniques of Theorem 13.1. Having done this, we have a formula E_x, whose length is proportional to n, that is true if and only if M accepts x.

Suppose M accepts a language L in time 2^n that is not accepted by any 2^(n/2) time-bounded NTM. (The existence of such a language follows from the NTIME hierarchy of Cook [1973a], which we have not proved.) We can recognize L as follows. Given x of length n, produce the formula E_x that is true if and only if x is in L. Now, if T(n) nondeterministic time suffices to accept the set of true formulas in the first-order theory of reals with addition, we may determine whether x is in L in time p(n) + T(cn) for some polynomial p. Then p(n) + T(cn) ≥ 2^(n/2) for an infinity of n's, else we could recognize L in time at most 2^(n/2) for all n. It follows by Lemma 12.3 that T(n) ≥ 2^(dn) i.o. for some d > 0.
Corollary The theory of reals with addition is nondeterministic exponential time-hard with respect to polynomial-time reductions.

Proof The proof is an easy generalization of the foregoing reduction of an arbitrary 2^n nondeterministic time TM.

13.7 THE P = NP QUESTION FOR TURING MACHINES WITH ORACLES: LIMITS ON OUR ABILITY TO TELL WHETHER P = NP
The reader should recall from Section 8.9 our discussion of Turing machines with oracles. These TM's had associated languages, called oracles, and had special states in which the string written to the left of the head could be tested in one step for membership in the oracle. Any oracle TM can have any oracle "plugged in," although its behavior will naturally vary depending on the oracle chosen. If A is an oracle, we use M^A for M with oracle A. The time taken by M^A is one step for each query to the oracle and one step for each ordinary move of the TM.

We define P^A to be the set of languages accepted in polynomial time by DTM's with oracle A. Also define NP^A to be the set of languages accepted by NTM's with oracle A in polynomial time. We shall prove that there are oracles A and B for which P^A = NP^A and P^B ≠ NP^B. This result has implications regarding our ability to solve the P = NP question for TM's without oracles. Intuitively, all known methods to resolve the question one way or the other will work when arbitrary oracles are attached. But the existence of A and B tells us that no such method can work for arbitrary oracles. Thus existing methods are probably insufficient to settle whether P = NP. We shall provide details along these lines after we see the constructions of A and B.

An oracle for which P = NP

Theorem 13.17 P^A = NP^A, where A = L_qbf, the set of all true quantified Boolean formulas (or any other PSPACE-complete problem).
Proof Let M^A be polynomial time bounded and nondeterministic, and let L = L(M^A). Then M^A queries its oracle a polynomial number of times, on strings whose lengths are bounded by a polynomial in the length of the input to M^A. Thus we may simulate the oracle computation in polynomial space. It follows that NP^A ⊆ PSPACE. However, any language L in PSPACE is accepted by some DTM M^A that reduces L to A in polynomial time and then queries its oracle. Thus PSPACE ⊆ P^A. But clearly P^A ⊆ NP^A, so P^A = NP^A.

An oracle for which P ≠ NP

We now show how to construct an oracle B ⊆ (0 + 1)* for which P^B ≠ NP^B. B will have at most one word of any length; exactly which words will be discussed later. We shall be interested in the language

L = {0^i | B has a word of length i}.

We may easily construct an NTM with oracle B that, given input 0^i, guesses a string of length i in (0 + 1)* and queries its oracle about the guessed string, accepting if the oracle says "yes." Thus L is in NP^B. However, we can construct B so that the string of each length, if any, is so cleverly hidden that a DTM with oracle B cannot find it in polynomial time.
Theorem 13.18 There is an oracle B for which P^B ≠ NP^B.
Proof We shall give a procedure to enumerate the set B. Set B will have at most one word of any length. As we generate B, we keep a list of forbidden words; these words are ruled out of consideration for possible membership in B. Assume an enumeration of DTM's with oracle and input alphabet {0, 1}, in which each TM appears infinitely often. We consider each M_i, i = 1, 2, ..., in turn. When M_i is considered, we shall have generated some forbidden words and a set B_i of words placed in B so far. There will be at most one word in B_i of each length 0, 1, ..., i − 1, and no longer words. Furthermore, no other words of length less than i will subsequently be put in B.

We simulate M_i^(B_i) on input 0^i. If M_i queries a word of length less than i, we consult B_i, which is all words in B so far, to see if the oracle responds "yes" or "no." If M_i queries a word y of length i or more, we assume that y is not in B (i.e., answer "no"), and to make sure y is not later placed in B, we add y to the list of forbidden words.

The simulation of M_i^(B_i) on 0^i continues for i^(log i) steps. Afterwards, whether or not M_i has halted, we make a decision about a word to put in B. If within i^(log i) steps M_i^(B_i) halts and rejects 0^i, then we put in B a word of length i that is not on the forbidden list, provided there is such a word. The word may be picked arbitrarily, say the lexicographically first word that is not forbidden. If M_i^(B_i) does not reject 0^i within i^(log i) steps, then no word of length i is placed in B. There is also no word of length i in B if all words of length i are forbidden by the time we finish simulating M_i^(B_i). However, the number of steps simulated for M₁, ..., M_i is at most

Σ_{j=1}^{i} j^(log j) ≤ i · i^(log i) = i^(1 + log i),

so the total number of words of all lengths forbidden so far is at most i^(1 + log i). As there are 2^i words of length i, we know that not all words of length i are forbidden if 2^i > i^(1 + log i), that is, if i > (1 + log i) log i. But the latter relation holds for all i ≥ 32, so it is only for a finite number of small i's that all words of length i could be forbidden.

Having finished the simulation of M_i^(B_i) on 0^i for i^(log i) steps, we generate the new set B_{i+1}, obtaining it from B_i and the selected word, if there is one, and are now ready to repeat the process for M_{i+1}^(B_{i+1}) on 0^(i+1).
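The counting threshold used in the proof can be confirmed numerically. The sketch below (ours; it uses floating-point log2, which is safe here because the margins are large) checks that 2^i > i^(1 + log i) holds from i = 32 on, and can fail for smaller i:

```python
from math import log2

def not_all_forbidden(i):
    """Equivalent form of 2**i > i**(1 + log2(i)): taking log base 2 of
    both sides gives i > (1 + log2(i)) * log2(i). When it holds, the
    simulations of M_1, ..., M_i cannot have forbidden every word of
    length i."""
    return i > (1 + log2(i)) * log2(i)

assert not_all_forbidden(32)                              # the proof's threshold
assert all(not_all_forbidden(i) for i in range(32, 5000)) # and beyond
assert not not_all_forbidden(16)                          # small i can fail
```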
Next we define a language L = {0^i | B has a word of length i}. We may easily construct a linear time-bounded NTM M^B with oracle B that, given input 0^i, nondeterministically guesses a string w of length i in (0 + 1)* and queries its oracle about w, accepting if the oracle says "yes." Thus L is in NP^B.

Suppose L is in P^B. Let M_k^B accept L, where M_k is a deterministic TM of polynomial time complexity p(n). As each TM has arbitrarily long codes, we may pick k such that k ≥ 32 and k^(log k) > p(k). If M_k^B accepts 0^k, then 0^k is in L, so B has a word of length k. That means M_k^(B_k) rejects 0^k. But M_k^B and M_k^(B_k) must behave identically on input 0^k, since B and B_k agree on words shorter than k, and B has no word of length k or more that is queried by M_k^(B_k) on 0^k. Thus M_k^B rejects 0^k, a contradiction.

If M_k^B rejects 0^k, so 0^k is not in L, then M_k^(B_k) cannot reject 0^k within k^(log k) steps. This follows since, had M_k^(B_k) rejected 0^k within k^(log k) steps, there would still be a word of length k not on the forbidden list (here we use k ≥ 32), and that word would be put in B. Thus 0^k would be in L. Hence M_k^(B_k) does not reject 0^k within k^(log k) steps. But M_k^B rejects 0^k within p(k) < k^(log k) steps, and M_k^B and M_k^(B_k) behave identically on 0^k, another contradiction. We conclude that L is not in P^B.
Significance of oracle results

Let us consider the ways used in this book to show two language classes to be the same or different, and see why Theorems 13.17 and 13.18 suggest that these methods will fail to resolve the P = NP question. We showed certain classes to be the same by simulation. For example, Chapter 7 contains many simulations of one type of TM by another. Chapter 5 contained simulations of PDA's by CFG's and conversely. Suppose we could simulate arbitrary polynomial time-bounded NTM's by polynomial time-bounded DTM's. (Note that giving a polynomial-time algorithm for any one NP-complete problem is in effect a polynomial-time simulation of all NTM's.) It is likely that the simulation would still be valid if we attached the same oracle to each TM. For example, all the simulations of Chapter 7 are still valid if we use oracle TM's. But then we would have P^B = NP^B, which was just shown to be false.

Other classes of languages were shown unequal by diagonalization. The hierarchy theorems, Theorems 12.8 and 12.9, and the proof that L_u is an r.e. set but not a recursive set are prime examples. Diagonalizations also tend to work when oracles are attached, at least in the three examples cited above. If we could diagonalize over P to show a language to be in NP − P, then the same proof might well work to show NP^A − P^A ≠ ∅. This would violate Theorem 13.17.

We also used translation lemmas to refine time and space hierarchies in Chapter 12. Could these help show P ≠ NP? Probably not, because the translation lemmas also hold when oracles are attached.

Lastly, we can use closure properties to show a difference between two language classes. For example, the DCFL's are contained in the CFL's, but the DCFL's are closed under complementation and the CFL's are not. This proves that there is a CFL that is not a DCFL. Could we find a closure property of P that is not shared by NP? This at first appears the most promising approach. While proofs that P is closed under an operation are likely to show also that P^A is closed under that operation, a nonclosure result for NP might not carry over to NP^A. On the other hand, showing NP not closed under an operation involves showing a particular language not to be in NP. This proof might be accomplished by diagonalization, but then it would likely carry over to NP^A. It might be done by developing a pumping lemma for NP, but this seems well beyond present capability. Finally, we might develop some ad hoc argument, but again, no such arguments have been found, and they appear very difficult.
EXERCISES

13.1 Suppose there is a 2^n time-bounded reduction of L₁ to L₂, and L₂ is in DTIME(2^n). What can we conclude about L₁?

13.2 Which of the following Boolean formulas are satisfiable?

a) ⋀_{i₁,i₂,i₃} (x_{i₁} ∨ x_{i₂} ∨ x_{i₃}), where (i₁, i₂, i₃) ranges over all triples of three distinct integers between 1 and 5.

*b) x₁ ∧ x₃ ∧ (x₂ ∨ ⋀_{i₁,i₂,i₃} (x_{i₁} ∨ x_{i₂} ∨ x_{i₃}))

13.3 A clique in a graph G is a subgraph of G that is complete; i.e., each pair of vertices is connected by an edge. The clique problem is to determine if a given graph G contains a clique of given size k.

a) Formulate the clique problem as a language recognition problem.

b) Prove that the clique problem is NP-complete by reducing the vertex cover problem to the clique problem. [Hint: Consider a graph G and its complement graph Ḡ, where Ḡ has an edge if and only if G does not have that edge.]

13.4 Given a graph G and integer k, the clique cover problem is to determine if there exist k cliques in G such that each vertex of G is in at least one of the k cliques. Prove that the clique cover problem is NP-complete by reducing the vertex cover problem to the vertex cover problem for graphs without triangles, thence to the clique cover problem. [Hint: For G = (V, E), consider the graph G′ = (E, {(e₁, e₂) | e₁ and e₂ are incident upon the same vertex in G}).]

13.5 Does the graph of Fig. 13.7 have

a) a Hamilton circuit?
b) a vertex cover of size 10?
c) a vertex coloring with 2 colors such that no two adjacent vertices are the same color?
13.6 Prove that the chromatic number problem is NP-complete by reducing the 3-CNF satisfiability problem to the chromatic number problem. [Hint: The graph in Fig. 13.8 can be used as a subgraph for each graph on n vertices. Complete your construction. Note that each v_i must be colored with a distinct color, say color i, 1 ≤ i ≤ n. The entire graph can be colored with n + 1 colors if and only if, for each i, one of x_i and x̄_i is colored with color i and the other is colored with color n + 1.]

[Fig. 13.7 An undirected graph.]

[Fig. 13.8 Graph used to show the chromatic number problem NP-complete: (a) a complete graph on v₁, ..., v_n; (b) each x_i and x̄_i connected to every v_j with i ≠ j.]

13.7 Show that the following problems are NP-complete.
a) Given a graph G, with integer distances on the edges, and two integers f and d, is there a way to select f vertices of G on which to locate "firehouses," so that no vertex is at distance more than d from a firehouse?

**b) The one-register code generation problem. Suppose we have a computer with one register and instructions

LOAD m: bring the value in memory location m to the register;
STORE m: store the value of the register in memory location m;
OP m: apply OP, which may be any binary operator, with the register as left argument and location m as right argument; leave the result in the register.

Given an arithmetic expression, each of whose operands denotes a memory location, and given a constant k, is there a program that evaluates the expression in k or fewer instructions?

**c) The unit execution time scheduling problem. Given a set of tasks T₁, ..., T_k, a number of processors p, a time limit t, and a set of constraints of the form T_i < T_j, meaning that task T_i must be processed before T_j, does there exist a schedule, that is, an assignment of at most one task to any processor at any time unit, so that if T_i < T_j is a constraint, then T_i is assigned an earlier time unit than T_j, and within t time units each task has been assigned a processor for one time unit?

**d) The exact cover problem. Given a set S and a set of subsets S₁, S₂, ..., S_k of S, is there a subset T ⊆ {S₁, S₂, ..., S_k} so that each x in S is in exactly one S_i in T?
13.8 The spanning tree problem. Determine whether a given tree T is isomorphic to some spanning tree of a given graph G.

a) Give a log-space reduction of the Hamilton circuit problem to the spanning tree problem.

*b) Give a direct log-space reduction of 3-CNF satisfiability to the spanning tree problem.
13.9
a) An n-dimensional grid is a graph G = (V, E), where

V = {(i₁, i₂, ..., i_n) | 1 ≤ i_j ≤ m_j}

and E consists of the pairs (v₁, v₂) such that v₁ and v₂ differ in only one coordinate, and the difference in that coordinate is one. For what values of the m_j and n does G have a Hamilton circuit?

*b) Let G be a graph whose vertices are the squares of an 8 × 8 chess board and whose edges are the legal moves of the knight. Find a Hamilton circuit in G.

*13.10 Prove that the Hamilton circuit problem is NP-complete even when restricted to planar graphs. [Hint: First show that the Hamilton circuit problem is NP-complete for planar graphs with "constraints," by reducing L_3sat to it. In particular, consider the class of planar graphs with constraint arrows connecting certain pairs of edges. Constraint arrows are allowed to cross each other but cannot cross edges of the graph. Show that the existence of Hamilton circuits that use exactly one edge from each pair of constrained edges is NP-complete. Then replace the constraint arrows one by one by graph edges by the substitution of Fig. 13.9(a). In the process, a constraint arrow may cross a graph edge, but only if the graph edge must be present in any Hamilton circuit. These crossings can be removed by the substitution of Fig. 13.9(b). The graph of Fig. 13.10 may be helpful in the first step of the hint, to represent a clause x + y + z.]

*13.11 A graph is 4-connected if removal of any three vertices and the incident edges leaves the graph connected. Prove that the Hamilton circuit problem is NP-complete even for 4-connected graphs. [Hint: Construct a subgraph with four distinguished vertices that can replace a vertex in an arbitrary graph G, so that even if additional edges are added from the four distinguished vertices to other vertices of G, the resulting graph will have a Hamilton circuit if and only if G did.]

[Fig. 13.10 Graph used in the construction of Exercise 13.10.]
13.12 Prove that the problem of determining whether a set of linear equations Ax = b has a solution with k components of x equal to zero is NP-complete. [Hint: If the components of x are constrained to 0 or 1, then an inequality of the form x₁ + x₂ + x₃ ≥ 1 can be replaced by an equation of the form y + x₁ + x₂ + x₃ = 4, provided y is constrained to be 1, 2, or 3. The system of equations y + z₁ + z₂ = 3, y = z₃ + z₄, and z_i + z_i′ = 1 for 1 ≤ i ≤ 4 has no solution with more than four variables zero, and has a solution with exactly four variables zero if and only if y is 1, 2, or 3.]

13.13 A kernel of a directed graph is a set of vertices such that 1) there is no arc from a vertex in the kernel to another vertex in the kernel, and 2) every vertex is either in the kernel or has an arc into it from the kernel. Prove that determining whether a directed graph has a kernel is NP-complete. [Hint: Observe that a cycle of length two or three may have only one vertex in a kernel.]

13.14 Prove that the traveling salesman problem is NP-complete.
Consider approximations to the traveling salesman problem.
Show
that the exist-
ence of a polynomial-time algorithm that produces a tour within twice the cost of the
optimal tour would imply *S13.16
&=
.A y/.
Consider the traveling salesman problem where the distances
inequality, that
d(v u
i;
3
)
<
d(v u v 2 )
+
d(v 2
Give a polynomial-time algorithm to find a tour that tour.
satisfy the triangle
is
is
,
v^).
within twice the cost of the optimal
EXERCISES 369

*13.17 Suppose there exists a polynomial-time algorithm for finding a clique in a graph of size at least one-half the size of the maximal clique.
a) Prove that there would exist a polynomial-time algorithm for finding a clique which is of size at least 1/sqrt(2) times the size of the maximal clique. [Hint: Consider replacing each vertex of a graph by a copy of the graph.]
b) Prove that for any k < 1 there would exist a polynomial-time algorithm for finding a clique which is of size at least k times the size of the maximal clique.
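The construction suggested in the hint to part (a) — replacing each vertex of a graph by a copy of the whole graph — is the lexicographic product G[G], in which the maximum clique size squares; a half-approximation on G[G] therefore yields a clique of size at least omega^2/2, from which a clique of size at least omega/sqrt(2) in G can be extracted. A minimal brute-force sketch of the product and its effect on clique size (the helper names are ours, not the book's):

```python
from itertools import combinations

def max_clique_size(vertices, edges):
    # Brute force: size of the largest pairwise-adjacent subset.
    adj = set(edges) | {(v, u) for (u, v) in edges}
    for size in range(len(vertices), 0, -1):
        for sub in combinations(vertices, size):
            if all(p in adj for p in combinations(sub, 2)):
                return size
    return 0

def substitute(vertices, edges):
    # Replace each vertex by a copy of the whole graph (lexicographic product G[G]).
    adj = set(edges) | {(v, u) for (u, v) in edges}
    V = [(u, v) for u in vertices for v in vertices]
    E = [(p, q) for p, q in combinations(V, 2)
         if (p[0], q[0]) in adj                       # vertices in different, adjacent copies
         or (p[0] == q[0] and (p[1], q[1]) in adj)]   # vertices inside one copy
    return V, E

# Triangle plus a pendant vertex: maximum clique size is 3.
V, E = [1, 2, 3, 4], [(1, 2), (2, 3), (1, 3), (3, 4)]
V2, E2 = substitute(V, E)
print(max_clique_size(V, E), max_clique_size(V2, E2))   # 3 9
```

The clique size squaring (3 becomes 9) is exactly what the amplification argument of part (a) exploits.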
13.18 Prove that it is NP-complete to determine whether the chromatic number of a graph is less than or equal to 3. [Hint: The graph of Fig. 13.11 can be used as a weak form of an OR gate when only three colors are available, in the sense that the output can be colored "true" if and only if at least one input is colored "true."]

Fig. 13.11 Graph used in Exercise 13.18.

*13.19 For n >= 6, let Gn = (Vn, En) be the graph where
Vn = {(i, j, k) | i, j, and k are distinct elements of {1, 2, ..., n}}, and
En = {(u, v) | u and v are disjoint triples}.
a) Let Xm(G) be the minimum number of colors needed to assign m distinct colors to each vertex of G so that no two adjacent vertices have a color in common. Prove that for n >= 6, X3(Gn) = n and X4(Gn) = 2n - 4.
b) Suppose there were a polynomial-time algorithm to color a graph G with at most twice the minimum number of colors needed. Then prove that P = NP. [Hint: Combine part (a) with Exercise 13.18.]
*13.20 Construct an algorithm for finding a Hamilton circuit in a graph that, under the assumption that P = NP, will find a Hamilton circuit in polynomial time whenever such a circuit exists. If no Hamilton circuit exists, the algorithm need not run in polynomial time. Note that it is not sufficient to design a nondeterministic algorithm and then use the hypothesis to claim that there is a deterministic polynomial-time algorithm. You must actually exhibit the potentially deterministic polynomial-time algorithm.
*13.21 If P != NP, prove that it is undecidable for L in NP whether L is in P.

*13.22 Prove that the existence of an NP-complete subset of 0* implies P = NP.

*13.23 An integer n is composite if and only if there exists an a, 1 < a < n, such that either
1) a^(n-1) != 1 mod n, or
2) there exist integers i and b, where n - 1 = 2^i * b, such that a^b - 1 and n have a common divisor.
If n is composite, at least one-half of the integers between 1 and n satisfy (1) or (2). Give a randomized algorithm that with high probability will determine whether a number is prime in polynomial time.
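A sketch of such a randomized test, in the standard Miller–Rabin form that follows Rabin [1977] (this is our rendering of the idea, not necessarily the exact witness condition the exercise states):

```python
import random

def probably_prime(n, trials=20):
    # Randomized compositeness test: for composite n, at least half of the
    # candidate bases a are witnesses, so each trial fails with probability <= 1/2.
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Write n - 1 = 2^i * b with b odd.
    b, i = n - 1, 0
    while b % 2 == 0:
        b //= 2
        i += 1
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        x = pow(a, b, n)             # three-argument pow: modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(i - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False             # a witnesses that n is composite
    return True                      # n is prime with high probability

print(probably_prime(97), probably_prime(91))   # True False
```

Each trial runs in time polynomial in the length of n, and the error probability drops exponentially with the number of trials, as the exercise requires.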
*13.24 Suppose there exists a function f mapping integers of length k onto integers of length k such that
1) f is computable in polynomial time;
2) f^-1 is not computable in polynomial time.
Prove that this would imply P != NP intersect co-NP. [Hint: Show that A = {(x, y) | f^-1(x) < y} is in (NP intersect co-NP) - P.]

13.25 Show that the following problems are PSPACE-complete.
a) Does a given regular expression (with only the usual operators ., +, and *) define all strings over its alphabet? [Hint: The proof parallels Theorem 13.14.]
Sb) The Shannon switching game. Given a graph G with two distinguished vertices s and t, suppose there are two players, SHORT and CUT. Alternately, with SHORT first, the players select vertices of G other than s and t. SHORT wins by selecting vertices that, with s and t, form a path from s to t; CUT wins if SHORT cannot. Can SHORT force a win no matter what CUT does?

*13.26 Show that if PSPACE != NP, then the inequality can be proved by diagonalization. That is, if L1, L2, ... is an enumeration of NP, then there exists a set L in PSPACE and a computable function f from integers to strings such that for each i, f(i) is in L if and only if f(i) is not in Li.

13.27 Give a polynomial-time algorithm for converting a quantified Boolean formula to prenex normal form Q1X1 Q2X2 ... QkXk (E), in which E is a Boolean expression in 3-CNF.

*13.28 Can any quantified Boolean formula be converted in polynomial time to an equivalent formula with at most ten distinct variables?
13.29 Show that the following problems are complete for P with respect to log-space reductions.
a) Is x in L(G), for string x and CFG G?
**b) The circuit value problem. Encode a circuit as a sequence C1, C2, ..., Cn, where each Ci is a variable x1, x2, ..., or a pair (j, k) with j and k less than i. Given an encoding of a circuit and an assignment of true and false to the variables, is the output of the circuit true?
*13.30 Show that the following problems are complete for NSPACE(log n) with respect to log-space reductions.
a) Is a Boolean expression in 2-CNF not satisfiable?
b) Is a directed graph strongly connected?
c) Is L(G) infinite, for CFG G without e-productions or useless nonterminals?

*13.31 Given CFG's G1 and G2 and integer k, show that the problem of determining whether there are words w1 in L(G1) and w2 in L(G2) that agree on the first k symbols is complete for nondeterministic exponential time with respect to polynomial-time reductions.
*13.32 Show that the problem of determining whether a regular expression with the intersection operator permitted denotes all strings in its alphabet requires time 2^(c*sqrt(n)) i.o. for some c > 0, and can be solved in time 2^(dn) for some d.
13.33 a) Write a formula in the theory of integers under addition expressing that every integer greater than 5 is the sum of three distinct positive integers.
b) Write a formula in number theory expressing that d is the greatest common divisor of a and b.
**c) Write a formula in number theory expressing that z = xy.

13.34 Apply the decision procedure of Section 13.6 for the theory of reals to decide whether the formula EXISTS y EXISTS x[(x + y = 14) AND (3x + y = 5)] is true.
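As a sanity check on what the decision procedure should report (our computation, not the book's): the two equations have a unique simultaneous solution over the rationals, so the formula is true over the reals, although x is not an integer.

```python
from fractions import Fraction

# Solve x + y = 14 and 3x + y = 5 exactly:
# subtracting the first equation from the second gives 2x = -9.
x = Fraction(5 - 14, 3 - 1)   # x = -9/2
y = 14 - x                    # y = 37/2
print(x, y)                   # -9/2 37/2
assert x + y == 14 and 3 * x + y == 5
```

The contrast with Exercise 13.35(c), which asks the same question in Presburger arithmetic, is the point: over the integers the formula is false.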
*13.35 a) Show that the theory of Presburger arithmetic (the integers with +, =, and <) requires nondeterministic time 2^(2^(cn)) i.o., for some c > 0. [Hint: Develop the following formulas of size proportional to n:
1) Rn(x, y, z): 0 <= y < 2^(2^n), and z is the residue of x mod y.
2) Pn(x): x < 2^(2^n), and x is a prime.
3) Gn(x): x is the smallest integer divisible by all primes less than 2^(2^n).
4) Mn(x, y, z): x, y, and z are integers in the range 0 to 2^(2^(2n)) - 1, and xy = z.]
b) Show that Presburger arithmetic can be decided in 2^(2^(2n)) space and 2^(2^(2^(2n))) time.
c) Use the algorithm of part (b) to decide EXISTS y EXISTS x[(x + y = 14) AND (3x + y = 5)].
*13.36 Extend Presburger arithmetic to allow quantification over arrays of integers. Thus we could write formulas such as
FORALL n FORALL A EXISTS B FORALL i[NOT(1 <= i <= n) OR EXISTS j[(1 <= j <= n) AND A(i) = B(j)]].
Prove that the theory of Presburger arithmetic with arrays is undecidable.

*13.37 To show that number theory is undecidable, it is
convenient to encode a sequence x0, x1, ..., xn of length n + 1 into an integer x such that each xi can be obtained from x by a formula.
a) Let m = max{n, x0, x1, ..., xn}. Prove that the ui = 1 + (i + 1)m!, 0 <= i <= n, are pairwise relatively prime and that ui > xi, 0 <= i <= n. This implies that there exists an integer b <= u0 u1 ... un such that b = xi mod ui, 0 <= i <= n.
b) Express Godel's beta function, beta(b, c, i) = b mod [1 + (i + 1)c], as a predicate.
c) Prove that number theory is undecidable.
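The encoding in parts (a) and (b) can be checked directly. The sketch below (the helper names are ours) takes c = m!, verifies pairwise coprimality of the ui, builds b by incremental Chinese remaindering, and recovers the sequence with the beta function:

```python
from math import factorial, gcd

def beta_encode(xs):
    # Encode x0, ..., xn into (b, c) with xi = b mod (1 + (i + 1) * c).
    n = len(xs) - 1
    c = factorial(max([n] + xs))              # c = m!  with m = max{n, x0, ..., xn}
    u = [1 + (i + 1) * c for i in range(n + 1)]
    # The u_i are pairwise relatively prime, so CRT yields a single b.
    for i in range(n + 1):
        for j in range(i):
            assert gcd(u[i], u[j]) == 1
    b, mod = 0, 1
    for xi, ui in zip(xs, u):
        # Incremental CRT: adjust b within its residue class so b = xi (mod ui).
        while b % ui != xi:
            b += mod
        mod *= ui
    return b, c

def beta(b, c, i):
    return b % (1 + (i + 1) * c)

xs = [3, 1, 4, 1, 5]
b, c = beta_encode(xs)
print([beta(b, c, i) for i in range(len(xs))])   # [3, 1, 4, 1, 5]
```

Since beta is definable from + and * alone, any finite computation history can be coded as one integer — the step that drives part (c).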
*13.38 Show that there are oracles C, D, and E for which
a) P^C, NP^C, and co-NP^C are all different;
b) NP^D = co-NP^D but P^D != NP^D;
c) P^E = NP^E is independent of the axioms of number theory.

*13.39 Show that P = NP if and only if P is an AFL.
Solutions to Selected Exercises

13.16 Construct a minimum cost spanning tree by sorting the edges by increasing cost, selecting edges starting with the lowest cost edge, and discarding any edge that forms a cycle. Let Topt be the minimum cost of a Hamilton circuit and let Ts be the cost of the minimum cost spanning tree. Clearly Ts < Topt, since a spanning tree can be obtained from a Hamilton circuit by deleting an edge. Construct a path through all vertices of the graph by traversing the spanning tree. The path is not a Hamilton circuit, since each edge of the spanning tree is traversed twice. The cost of this path is at most 2Ts < 2Topt. Traverse the path until encountering some edge e1 leading to a vertex for the second time. Let e2 be the edge immediately following e1 on the path. Replace the portion of the path consisting of e1 and e2 by a single direct edge. By the triangle inequality this cannot increase the cost. Repeat the process of replacing pairs of edges by a single edge until a Hamilton circuit is obtained.
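The construction in this solution — Kruskal spanning tree, doubled-edge walk, then shortcutting repeated vertices, which the triangle inequality makes safe — can be sketched as follows (a rough rendering with Euclidean distances; the function names are ours):

```python
from itertools import combinations

def two_approx_tour(points):
    # Tour of cost at most twice optimal, assuming the triangle inequality.
    def d(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

    n = len(points)
    parent = list(range(n))
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    # Kruskal: sort edges by cost, discard any edge that forms a cycle.
    tree = {v: [] for v in range(n)}
    for u, v in sorted(combinations(range(n), 2),
                       key=lambda e: d(points[e[0]], points[e[1]])):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree[u].append(v)
            tree[v].append(u)

    # Walk the tree (each edge would be traversed twice), skipping
    # already-visited vertices -- the "replace two edges by one" shortcut.
    tour, seen, stack = [], set(), [0]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            tour.append(v)
            stack.extend(tree[v])
    return tour + [0]          # close the circuit

tour = two_approx_tour([(0, 0), (0, 1), (1, 0), (1, 1)])
print(tour)
```

Each vertex appears exactly once before the closing edge, and the shortcutting never increases cost under the triangle inequality, matching the 2Topt bound above.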
13.25b First we show that the Shannon switching game is in PSPACE. Consider a game tree. The root indicates the initial game position. Assume SHORT moves first. The sons of the root correspond to each possible game position after a move of SHORT. In general, a vertex in the tree corresponds to the moves so far (which determine a game position), and the sons of the vertex correspond to the board position after each possible additional move. A position is a winning position only if SHORT has a forced win from the position. Thus a leaf is a winning position only if SHORT has a path from s to t. We can recursively define winning positions as follows. If vertex v is not a leaf and corresponds to a position in which it is SHORT's move, then v is a winning position if there exists a son that is a winning position. If it is CUT's move, then v is a winning position only if every son is a winning position. Since the tree has depth at most n, the number of vertices of G, a recursive algorithm to determine if the root is a winning position requires space at most n. Thus the problem is in PSPACE.

To show that the Shannon switching game is PSPACE-complete, we reduce the quantified Boolean formula problem to it. Consider a quantified Boolean formula

EXISTS x1 FORALL x2 EXISTS x3 ... FORALL x(n-1) EXISTS xn F(x1, ..., xn),

and without loss of generality assume that the quantifiers alternate (otherwise add dummy quantifiers and variables). Consider the graph, called a ladder, shown in Fig. 13.12, where n = 3. There will be additional edges (see dashed lines), but they are unimportant for the first observation. SHORT plays first. He must at some time select either x1(1) or x1'(1). This corresponds to selecting a value for the existentially quantified variable x1. The next four moves are forced, ending up with SHORT having selected x1(1), x1(2), and EXISTS-x1 and CUT having selected x1'(1) and x1'(2), or SHORT having selected x1'(1), x1'(2), and EXISTS-x1 and CUT having selected x1(1) and x1(2). If SHORT does not select one of x1(1), x1'(1), x1(2), x1'(2), or EXISTS-x1, then CUT wins. If SHORT selects EXISTS-x1, then CUT is given the advantage in selecting x1(1) or x1'(1). The purpose of the vertex EXISTS-x1 is to consume an additional move of SHORT, thereby allowing CUT the first selection from the set {x2(1), x2'(1), x2(2), x2'(2)}. This means that CUT selects the value for the universally quantified variable x2, and so on.

Once the values for x1, x2, ..., xn have been selected, the dashed portion of the graph, which corresponds to the quantifier-free portion of the formula, comes into play. Without loss of generality we can assume that F(x1, ..., xn) is in conjunctive normal form. Let F = F1 AND F2 AND ... AND Fk, where each Fi is a clause. Construct the tree of Fig. 13.13. Identify the root with vertex EXISTS-xn in Fig. 13.12. From vertex Fi add an edge to vertex xj(1) or xj'(1) if xj or xj', respectively, appears in Fi. Now observe that if SHORT selects one son of a vertex of the tree, CUT can select the other. Clearly SHORT can build a path to at least one Fi, and CUT can force SHORT to reach only one Fi and can determine which Fi. Observe that SHORT has a path from s to t if Fi is connected to some xj(1) or xj'(1) which "has value one"; that is, SHORT has selected xj(1) or xj'(1).

If the quantified formula is true, then SHORT can specify the existentially quantified variables, regardless of CUT's choices for the universally quantified variables, so that each Fi is true. Thus regardless of which Fi is forced on SHORT, that Fi is true and hence connected to a selected xj or xj'. Hence SHORT can win. On the other hand, if the quantified Boolean formula is false, CUT can select the universally quantified variables so that for the assignment to the x's, F is false. Then CUT forces SHORT to reach only one Fi, and in particular an Fi that is false for the assignment. Thus SHORT does not complete a path, and CUT wins. Thus SHORT is guaranteed a win if and only if the quantified Boolean formula is true, and hence the Shannon switching game on vertices is complete for PSPACE.
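The recursive definition of winning positions in this solution is ordinary minimax search; its working space is proportional to the depth of the game tree, which is the reason the membership half of the argument lands in PSPACE. A toy sketch (the tiny hand-built tree is ours, not a Shannon game instance):

```python
def short_wins(position, short_to_move, moves, is_short_win):
    # A position is winning for SHORT if, at SHORT's move, some successor is
    # winning, and at CUT's move, every successor is winning. The space used
    # is proportional to the recursion depth, i.e., the number of moves.
    successors = moves(position)
    if not successors:                        # leaf: has SHORT built an s-t path?
        return is_short_win(position)
    results = [short_wins(p, not short_to_move, moves, is_short_win)
               for p in successors]
    return any(results) if short_to_move else all(results)

# Toy game tree: from the root, leaf "a" loses for SHORT and leaf "b" wins.
tree = {"root": ["a", "b"], "a": [], "b": []}
wins = {"a": False, "b": True}
print(short_wins("root", True, tree.get, wins.get))    # True: SHORT picks "b"
print(short_wins("root", False, tree.get, wins.get))   # False: CUT picks "a"
```

For the Shannon game, `moves` would enumerate the unselected vertices other than s and t, and `is_short_win` would test for an s-t path through SHORT's vertices.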
BIBLIOGRAPHIC NOTES

Cobham [1964] was the first to devote attention to the class P. The first NP-complete problems, including the versions of the satisfiability problem in Theorems 13.1, 13.2, and 13.3, were introduced by Cook [1971b]. Karp [1972] gave a wide variety of NP-complete problems and clearly demonstrated the importance of the idea. Some of these problems include the vertex cover problem (Theorem 13.4), the clique cover problem (Exercise 13.4), the exact cover problem (Exercise 13.7d), the chromatic number problem (Exercise 13.6), the Hamilton circuit problem (Theorem 13.6), and the traveling salesman and partition problems mentioned in Section 13.2. The clique problem (Exercise 13.3) is from Cook [1971]. Theorem 13.7, the NP-completeness of integer linear programming, is independently due to Gathen and Sieveking [1976] and Borosh and Treybig [1976]. The proof given is from Kannan and Monma [1978].

An enormous number of problems have since been shown NP-complete, and those problems come from a wide variety of areas. Garey and Johnson [1978] attempt to catalog such problems, and we shall here mention only a sample of the work that has been done and the areas that have been covered. Sethi [1975], and Bruno and Sethi [1976] cover code generation problems (Exercise 13.7b appears in the latter). Scheduling problems are considered in Coffman [1976] and Ullman [1975]; the solution to Exercise 13.7(c) can be found in both. Garey, Johnson, and Stockmeyer [1976], and Garey, Graham, and Johnson [1976] provide a variety of powerful results, principally for graph problems. Papadimitriou [1976] and Papadimitriou and Steiglitz [1977] study path problems in graphs. Exercise 13.18 is taken from Stockmeyer [1973], Exercise 13.10 from Garey, Johnson, and Tarjan [1976], and Exercise 13.12 is by J. E. Hopcroft.

A number of results showing large classes of NP-complete problems appear in Hunt and Szymanski [1976], Hunt and Rosenkrantz [1977], Kirkpatrick and Hell [1978], Lewis [1978], Schaefer [1978], and Yannakakis [1978]. Among the promising approaches to dealing with NP-complete problems is the idea of considering approximate algorithms for the optimization versions of problems. These algorithms run in polynomial time but are guaranteed to come only within some specified range of the optimum. Johnson [1974] considered approximation algorithms for some of the
NP-complete problems appearing in Karp [1972]. Sahni and Gonzalez [1976] were the first to prove the approximation to an NP-complete problem to be NP-complete itself (Exercise 13.15), while Garey and Johnson [1976] showed that coming within less than a factor of two of the chromatic number of a graph (the number of "colors" needed to ensure that each vertex be colored differently from adjacent vertices) is NP-complete (Exercise 13.19). Exercise 13.17, on improving an approximation to a maximal clique, is also from Garey and Johnson [1976]. Rosenkrantz, Stearns, and Lewis [1977] studied approximations to the traveling salesman problem (Exercise 13.16). Christofides [1976] has improved on their results.

A number of papers have attempted to explore the structure of NP on the hypothesis that P != NP. Ladner [1975a] shows, for example, that if P != NP, then there are problems that are neither in P nor NP-complete. Adleman and Manders [1977] show that certain problems have the property that they are in P if and only if NP = co-NP. Book [1974, 1976] shows inequality among certain complexity classes, such as DTIME(n^k) and DSPACE(log^k n). Exercise 13.39, relating P = NP to AFL theory, is from Book [1970]. Berman and Hartmanis [1977] look at density-preserving reductions of one problem to another. Exercise 13.22 is from Berman [1978], and Exercise 13.20 is from Levin [1973].

Particular attention has been given to the complexity of recognizing primes. It is easy to show that the nonprimes (written in binary) are in NP, but it was not known that the primes written in binary are in NP until Pratt [1975]. Thus, if the recognition of primes is NP-complete, then by Theorem 13.8, NP = co-NP. Miller [1976] gives strong evidence that the recognition of primes is in P. Exercise 13.23, which shows an efficient test for determining primality with high probability, is from Rabin [1977]. A similar result is found in Solovay and Strassen [1977]. Exercise 13.24 is from Brassard, Fortune, and Hopcroft [1978].
The first PSPACE-complete problems were introduced by Karp [1972], including CSL recognition (Theorem 13.11) and "L = Sigma*" for regular expressions (Exercise 13.25a). PSPACE-completeness of quantified Boolean formulas was shown by Stockmeyer [1974]. Exercise 13.25(b), PSPACE-completeness of the Shannon switching game, is by Even and Tarjan [1976]. Stockmeyer [1978] gives a hierarchy of problems between NP and PSPACE, on the assumption that NP != PSPACE. Problems complete for P with respect to logarithmic space reductions have been considered by Cook [1973b], Cook and Sethi [1976], Jones [1975], Jones and Laaser [1976] (including Theorem 13.12), and Ladner [1975b] (Exercise 13.29b). Problems complete for NSPACE(log n) with respect to log-space reductions are considered in Savitch [1970] (including Theorem 13.13, on reachability), Sudborough [1975a,b], Springsteel [1976], and Jones, Lien, and Laaser [1976]. Exercise 13.30 is from Jones, Lien, and Laaser [1976].

The first problem proved to require exponential time (in fact, exponential space) was presented by Meyer and Stockmeyer [1973]. The problem is similar in spirit to that of Theorem 13.15. The lower bounds on the complexity of the theory of reals with addition (Theorem 13.16) and of Presburger arithmetic (Exercise 13.35) are from Fischer and Rabin [1974]. The upper bounds for these problems are from Cooper [1972], Ferrante and Rackoff [1975], and Oppen [1973]. Berman [1977] and Bruss and Meyer [1978] put what are, in a sense, more precise bounds (outside the usual time-space hierarchies) on these problems. The undecidability of Presburger arithmetic with arrays is from Suzuki and Jefferson [1977].

The literature contains a number of papers that deal with the complexity of a variety of problems and their special cases, dividing problems into groups, principally: polynomial, NP-complete, PSPACE-complete, and provably exponential. A sample of the areas covered include Diophantine equations in Adleman and Manders [1976], problems about asynchronous computation in Cardoza, Lipton, and Meyer [1976], problems about regular expressions in Hunt [1975] (including Exercise 13.32), Hunt, Rosenkrantz, and Szymanski [1976], and Stockmeyer and Meyer [1973], problems about context-free grammars in Hunt and Rosenkrantz [1974, 1977], Hunt and Szymanski [1975, 1976], and Hunt, Szymanski, and Ullman [1975] (including Exercise 13.31), and game theory in Schaefer [1976].

The results of Section 13.7 and Exercise 13.38, on the P = NP question in the presence of oracles, are from Baker, Gill, and Solovay [1975]. However, Kozen [1978] presents another viewpoint on the issue. Exercise 13.26 is from there. Ladner, Lynch, and Selman [1974] studied the different kinds of bounded reducibility, such as many-one, Turing, and truth tables. Another attack on the P = NP question has been the development of models whose deterministic and nondeterministic time-bounded versions are equivalent. The vector machines (Pratt and Stockmeyer [1976]) are the first, and other models have been proposed by Chandra and Stockmeyer [1976] and Kozen [1976]. The reader should also note the equivalence for space-bounded versions of the "auxiliary PDA's" discussed in Section 14.1.
CHAPTER 14

HIGHLIGHTS OF OTHER IMPORTANT LANGUAGE CLASSES

Numerous models and classes of languages have been introduced in the literature. This chapter presents a few of those that appear to be of greatest interest. Section 14.1 discusses auxiliary pushdown automata, which are PDA's with two-way input and additional general-purpose storage in the form of a space-bounded Turing tape. The interesting property of auxiliary PDA's is that for a fixed amount of extra storage, the deterministic and nondeterministic versions are equivalent in language-recognizing power, and the class of languages accepted by auxiliary PDA's with a given space bound is equivalent to the class of languages accepted by Turing machines of time complexity exponential in that space bound.

Section 14.2 is concerned with stack automata, which are PDA's with the privilege of scanning the stack below the top symbol, but only in a read-only mode. Languages accepted by variants of the two-way stack automaton turn out to be time- or space-complexity classes.

Section 14.3 is devoted to indexed languages, since they arise in a number of contexts and appear to be a natural generalization of the CFL's. Finally, Section 14.4 introduces developmental systems, which attempt to model certain biological patterns of growth.

14.1 AUXILIARY PUSHDOWN AUTOMATA

An S(n) auxiliary pushdown automaton (APDA) is pictured in Fig. 14.1. It consists of
1) a read-only input tape, surrounded by the endmarkers ¢ and $,
2) a finite state control,
3) a read-write storage tape of length S(n), where n is the length of the input string w, and
4) a stack.

Fig. 14.1 An auxiliary PDA.
A move of the APDA is determined by the state of the finite control, along with the symbols scanned by the input, storage, and stack heads. In one move, the APDA may do any or all of the following:
1) change state,
2) move its input head one position left or right, but not off the input,
3) print a symbol on the cell scanned by the storage head and move that head one position left or right,
4) push a symbol onto the stack or pop the top symbol off the stack.

If the device is nondeterministic, it has a finite number of choices of moves of the above type. Initially the tape heads are at the left end of the input and storage tapes, with the finite control in a designated initial state and the stack consisting of a designated start symbol. Acceptance is by empty stack.

Equivalence of deterministic and nondeterministic APDA's

The interest in APDA's originates from the discovery that deterministic and nondeterministic APDA's with the same space bound are equivalent, and that S(n) space on an APDA is equivalent to c^S(n) time on a Turing machine. That is, the following three statements are equivalent.
1) L is accepted by a deterministic S(n)-APDA.
2) L is accepted by a nondeterministic S(n)-APDA.
3) L is in DTIME(c^S(n)) for some constant c.
These facts are established in the following series of lemmas.
Lemma 14.1 If L is accepted by a nondeterministic S(n)-APDA A with S(n) >= log n, then L is in DTIME(c^S(n)) for some constant c.

Proof Let A have s states, t storage symbols, and p stack symbols. Given an input of length n, there are n + 2 possible input head positions, s possible states, S(n) possible storage head positions, and t^S(n) possible storage tape contents, for a total of s(n) = (n + 2)sS(n)t^S(n) possible configurations.† As S(n) >= log n, there is a constant d such that s(n) <= d^S(n) for all n >= 1.

Construct a TM M that performs the following operations on input w of length n.
1) M constructs a PDA Pw that on e-input simulates moves of A on input w.
2) M converts Pw to a CFG Gw by the algorithm of Theorem 5.4.

For fixed A, Pw is a different PDA for each w, with the state and contents of input and storage tapes of A encoded in the state of Pw. N(Pw) is {e} or the empty set, depending on whether or not A accepts w. Pw has at most s(n) <= d^S(n) states and p stack symbols. Therefore Gw has at most pd^(2S(n)) + 1 variables. As A can push only one symbol, no right side of a production of Gw has more than two variables, so there are at most rd^S(n) productions for any nonterminal of Gw, where r is the maximum number of choices that A has in any situation. Thus the test of Theorem 6.6, to tell whether L(Gw) is empty, takes time proportional to rp^2 d^(5S(n)) at most. Since r, p, and d are constants, there is a constant c such that M can determine in time at most c^S(n) whether L(Gw) is nonempty, i.e., whether w is accepted by A.
Lemma 14.2 If L is in DTIME(T(n)), then L is accepted in time T^4(n) by a one-tape TM M1 that traverses its tape by making a complete scan in one direction, reaching the first cell it has never before scanned, then reversing direction and repeating the process, as shown in Fig. 14.2.

Fig. 14.2 Traversal pattern of TM M1.

† Note that a "configuration" in the sense used here does not include the stack contents.
Proof By Theorems 12.3 and 12.5, L is accepted by a T^2(n) time-bounded one-tape TM M. M1 simulates M, marking on a second track M's head position and the cells M1 has already scanned. As long as the head of M1 travels in the same direction as M's head, M1 can simulate a move of M with each of its own moves. When M moves in the opposite direction, M1 leaves the head marker, completes its scan, and simulates that move on the return pass. Thus M1 simulates at least one move of M per pass, taking at most the sum over i from 0 to T^2(n) of (n + i), which is at most T^4(n) moves, to complete the simulation of M.

Lemma 14.3 If L is in DTIME(c^S(n)) for any constant c, then L is accepted by a deterministic S(n)-APDA.

Proof By Lemma 14.2, L is accepted by a c^(4S(n)) time-bounded one-tape TM M with the traversal pattern of Fig. 14.2. Define d = c^4, so that M's time is d^S(n) bounded. Let the triple (q, Z, t) stand for the statement that at time t, M is in state q scanning symbol Z, where t <= d^S(n). Note that since the head motion of M is independent of the data, the cell scanned at time t is easily calculated from t.

The heart of the construction of a deterministic S(n)-APDA A that accepts L is
the recursive procedure TEST of Fig. 14.3, which assigns the value true to the triple (q, Z, t)† if and only if
1) t = 0, q is the start state, and Z is the symbol in the first tape cell of M, or
2) M scans some cell for the first time at time t, Z is the original symbol in that cell, and there is a true triple (p, X, t - 1) that implies that M enters state q after one move, or
3) M previously scanned the cell visited at time t, and there are true triples (p1, X1, t - 1) and (p2, X2, t') such that the first implies that state q is entered after one move, and the second implies that Z was left on the tape cell; t' is the last time the tape cell was scanned. Recall that the head motion of M is uniform, and thus the time t' at which the cell was last visited is easily calculated from t.

As TEST only calls itself with smaller third arguments, it eventually terminates. The S(n)-APDA A evaluates TEST by keeping the arguments on the storage tape. When TEST calls itself, A pushes the old arguments onto the stack, and when TEST returns, A pops them off the stack and puts them on the storage tape. The complete algorithm that A executes is

    for each triple (q, Z, t) such that q is an accepting state and 0 <= t <= d^S(n) do
        if TEST(q, Z, t) then accept

† "At time t" means "after t moves have elapsed," so initially, t = 0.

Theorem 14.1 The following are equivalent for S(n) >= log n.
1) L is accepted by a deterministic S(n)-APDA.
procedure TEST(q, Z, t);
begin
    if t = 0, q is the initial state of M, and Z is the first input symbol then
        return true;
    if 1 <= t <= n and Z is the t-th input symbol, or t = in + i(i - 1)/2 for some integer i >= 1 and Z = B then
        for each state p and symbol X do
            if M enters state q when scanning X in state p, and TEST(p, X, t - 1) then
                return true;
    if t > n and t != in + i(i - 1)/2 for any integer i >= 1
        /* the times in + i(i - 1)/2 are exactly the times when M scans a new cell */
    then begin
        let t' be the previous time M scanned the same cell as at time t;
        for all states p1 and p2 and symbols X1 and X2 do
            if M enters state q when scanning X1 in state p1, and M writes Z when scanning X2 in state p2, and TEST(p1, X1, t - 1) and TEST(p2, X2, t') then
                return true
    end;
    return false
end

Fig. 14.3 The procedure TEST.
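The new-cell times in + i(i - 1)/2 used by TEST come from the fixed zigzag head motion of Fig. 14.2: each sweep is one cell longer than the last. A quick simulation (our code, on the assumption — consistent with the stated formula — that the tape is two-way infinite and the visited region grows alternately on each side) confirms the formula:

```python
def new_cell_times(n, passes):
    # Simulate the oblivious zigzag of Fig. 14.2: start on cell 1 with cells
    # 1..n already holding the input, and record the time of each step onto a
    # never-before-scanned cell, reversing direction there.
    times, pos, t, d = [], 1, 0, 1
    lo, hi = 1, n
    while len(times) < passes:
        pos += d
        t += 1
        if pos > hi:
            hi = pos; times.append(t); d = -1
        elif pos < lo:
            lo = pos; times.append(t); d = 1
    return times

n = 5
print(new_cell_times(n, 4))                                 # [5, 11, 18, 26]
assert new_cell_times(n, 4) == [i * n + i * (i - 1) // 2 for i in range(1, 5)]
```

Because the motion is data-independent, both the cell scanned at time t and the previous visit time t' are computable from t alone — exactly what TEST relies on.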
2) L is accepted by a nondeterministic S(n)-APDA.
3) L is in DTIME(c^S(n)) for some constant c.

Proof That (1) implies (2) is obvious. Lemma 14.1 established that (2) implies (3), and Lemma 14.3 established that (3) implies (1).

Corollary L is in P if and only if L is accepted by a log n-APDA.
STACK AUTOMATA stack automaton (SA)
1)
The input
2)
The
is
is
a
PDA
with the following two additional features.
two-way, read-only with endmarkers.
pop moves
stack head, in addition to push and
top of the stack can
at the
enter the stack in read-only mode, traveling up and
down
the stack without
rewriting any symbol.
A
stack
automaton
A move
of an
is
in Fig. 14.4, in
read-only mode.
determined by the
state, the input,
shown
SA
is
scanned, and whether or not the top of the stack head. In either case,
in
move one position left move may also include the stack head 1)
is
one move the
state
is
and stack symbols
being scanned by the stack
may change and
the input head
may
head is not at the top of the stack, a a stack head motion, one position up or down the stack. If
or right.
If the stack
at the top, the permissible stack actions are:
push a symbol onto the stack,
HIGHLIGHTS OF OTHER IMPORTANT LANGUAGE CLASSES
382
\v
Ai
Top
of stack
Finite
control
Fig. 14.4
2)
pop
3)
move one
the top
In actions (1)
symbol
position
and
A
stack automaton.
off the stack, or
down
the stack without pushing or popping.
(2) the stack
head stays at the top; in action (3) it leaves the top mode, which it may leave only by returning to
of stack and enters the read-only the top of the stack.
input head is at the left end, the finite control is in a designated and the stack consists of a single designated start symbol. Acceptance
Initially, the initial state, is
by
final state.
If there is never more than one move in any situation, the device is deterministic (a DSA); if there is a finite number of choices of moves in any situation, the automaton is nondeterministic (an NSA). If the device never pops a symbol it is nonerasing (an NEDSA or NENSA). If the input head never moves left, the stack automaton is one-way (a 1DSA, 1NENSA, and so on). In the absence of any statement to the contrary, we shall assume an SA is two-way, deterministic, and permits erasing.

Example 14.1 Let L = {0^n 1^n 2^n | n ≥ 1}. We design an SA to accept L as follows. The input head moves right at each move. While 0's are encountered, they are pushed onto the stack above the bottom marker (start symbol) Z_0. The stack head remains at the top of stack in read-write mode. Fig. 14.5(a) shows the situation after reading the 0's. On seeing the first 1, the stack head moves down, entering the read-only mode. As successive 1's are read, the stack head moves one position down for each 1 (but if the first 2 is not seen at the same time the stack head reaches the bottom marker, there is no next move, and the SA does not accept). The situation in which the SA then finds itself is shown in Fig. 14.5(b). As 2's are scanned on the input, the stack head moves up one position for each 2. A move to an accepting state is permissible only when the stack head is at the top and $ is scanned on the input, as in Fig. 14.5(c). Of course, the state from which this move can be made is only entered after we have seen 2's, so we cannot accept inputs like ¢$ or ¢00$.

Note that the SA we have described is one-way, deterministic, and nonerasing.
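The behavior of Example 14.1 can be traced directly in code. The sketch below is our own illustration, not part of the book: the stack is a Python list, the stack head is an index into it, and `phase` plays the role of the finite control.

```python
def accepts(w: str) -> bool:
    """Simulate the one-way nonerasing DSA of Example 14.1 on input w,
    checking membership in {0^n 1^n 2^n | n >= 1}."""
    stack = ["Z0"]               # bottom marker (start symbol)
    pos = 0                      # stack head position; top is len(stack) - 1
    phase = 0                    # 0: reading 0's, 1: reading 1's, 2: reading 2's
    for a in w:
        if phase == 0 and a == "0":
            stack.append("0")    # push in read-write mode, head stays at top
            pos = len(stack) - 1
        elif a == "1" and phase <= 1:
            if stack[pos] == "Z0":
                return False     # head already at bottom marker: no next move
            phase = 1
            pos -= 1             # enter the stack, move one position down
        elif a == "2" and phase in (1, 2):
            if phase == 1 and stack[pos] != "Z0":
                return False     # first 2 must arrive exactly at the marker
            phase = 2
            if pos == len(stack) - 1:
                return False     # cannot move above the top
            pos += 1             # move one position up for each 2
        else:
            return False
    # accept only with the head back at the top after seeing some 2's
    return phase == 2 and pos == len(stack) - 1
```

Because the head only travels down and back up the pushed 0's, the simulation never rewrites a stack symbol, matching the nonerasing discipline of the example.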
Fig. 14.5 ID's of a stack automaton.
Transition tables

In the remainder of the section we give proof sketches for a number of the fundamental results characterizing the languages accepted by the varieties of SA. One of the central ideas is the simulation of stack automata by other devices by means of a transition table, which is a succinct representation of a stack (actually the stack except for the top symbol). Suppose a deterministic stack automaton is in state q with the input head at position i and the stack head at the next-to-top symbol. Then the SA is in read-only mode and the stack cannot change until the stack head reaches the top. For a particular sequence of stack symbols, the stack head may never reach the top, or it will first reach there in some state p with input in position j. For each q and i the transition table tells whether the stack head ever moves to the top and if so gives the state p and input head position j when the top is reached. Thus the transition table completely characterizes the effect of the sequence of stack symbols below the top, provided acceptance does not occur when the stack head is inside the stack. The number of distinct transition tables for an SA with input of length n (excluding endmarkers) and with s states is thus [s(n + 2) + 1]^{s(n+2)}.
With input positions encoded in binary, a transition table requires only cn log n bits for some constant c that depends on the number of states of the given SA.

If the SA is nondeterministic, then for each q and i, the transition table must give the set of (p, j) pairs such that, started in state q, with input position i, and the stack head at the next-to-top stack symbol, the top of stack can be reached in state p and input position j. The number of possible transition tables for an s-state NSA with input of length n is [2^{s(n+2)}]^{s(n+2)}, so such a transition table can be encoded in cn² bits, where c depends only on the number of states of the NSA.
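Both counting formulas can be checked mechanically for tiny parameters. The sketch below is ours (the function names are not from the book); the brute-force enumeration confirms the deterministic count for s = 1, n = 1.

```python
from itertools import product

def num_det_tables(s, n):
    """Deterministic case: each of the s(n+2) (state, position) pairs maps
    to 'undefined' or to one of the s(n+2) pairs (p, j)."""
    entries = s * (n + 2)
    return (entries + 1) ** entries          # [s(n+2)+1]^{s(n+2)}

def num_nondet_tables(s, n):
    """Nondeterministic case: each pair maps to a subset of the pairs."""
    entries = s * (n + 2)
    return (2 ** entries) ** entries         # [2^{s(n+2)}]^{s(n+2)}

# brute-force check for s = 1, n = 1: 3 entries, 4 possible values each
pairs = [(0, i) for i in range(3)]
values = ["undefined"] + pairs
assert len(list(product(values, repeat=len(pairs)))) == num_det_tables(1, 1)
```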
Characterization of stack languages by time and space complexity classes

We shall show that a deterministic SA can be simulated by an n log n-APDA and conversely that an n log n-APDA can be simulated by a DSA, establishing the equivalence of DSA and n log n-APDA. In a similar manner we establish the equivalence of NSA and n²-APDA. For nonerasing SA we establish the equivalence of NEDSA and DSPACE(n log n) and the equivalence of NENSA and NSPACE(n²). A series of lemmas is used.

Lemma 14.4 Each type of stack automaton is equivalent to one of the same type that accepts only at the top of stack.

Proof We modify a given SA so that in any accepting state it moves its stack head up the stack until the top is reached.
Lemma 14.5 If L is accepted by an NEDSA, then L is in DSPACE(n log n).

Proof Given an NEDSA A that accepts only at the top of the stack, we construct a Turing machine M that simulates A by keeping track of A's state, its input head position, top stack symbol, and the transition table for the portion of the stack below the top symbol. The initial transition table is the table associated with the empty stack string ("undefined" for all q and i). We need only explain how to construct the transition table T' associated with the stack string X_1 X_2 ⋯ X_m given the table T for X_1 X_2 ⋯ X_{m-1}. For each state q and input position i, execute the algorithm of Fig. 14.6. The algorithm keeps track of the sequence of state-input-position pairs (p, j) in which X_m is scanned. Each time the stack head moves to X_{m-1}, T is consulted to determine the next state-position pair in which X_m will be scanned, if any. The variable COUNT checks that the length of the sequence of (p, j)'s does not exceed the product of s, the number of states, and n + 2, the number of input positions. If so, A is surely in a loop, so that value of T'(q, i) is "undefined."
begin
    COUNT := 0;
    (p, j) := (q, i);
    while COUNT < s(n + 2) do
        begin
            COUNT := COUNT + 1;
            suppose A in state p, scanning stack symbol X_m, at input position j enters state r, moves the input head to position k and the stack head in direction D;
            if D = "up" then return (r, k);
            if D = "stationary" then (p, j) := (r, k);
            if D = "down" then
                if T(r, k) = "undefined" then return "undefined"
                else (p, j) := T(r, k)
        end;
    return "undefined"
end

Fig. 14.6 Algorithm to compute transition table for an NEDSA.
Note that for given (q, i) the algorithm of Fig. 14.6 requires only O(log n) space to hold the value of COUNT. Thus T' can be computed from T and X_m in the space it takes to store T and T', which is n log n. The TM has only to simulate A directly when the stack head is at the top of the stack, consult the current transition table when the stack head leaves the top, and compute a new transition table (throwing away the old) when a stack symbol is pushed. As stack symbols are never erased, we need not preserve the stack.
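The algorithm of Fig. 14.6 translates almost line for line into code. The sketch below is our own illustration under our own representation choices: a table maps (q, i) to a pair (p, j) or to None for "undefined", and `move` stands in for A's transition behavior while scanning X_m.

```python
def extend_table(T, move, s, n):
    """Compute the table T' for X_1...X_m from the table T for
    X_1...X_{m-1}, in the spirit of Fig. 14.6 (a sketch, not the book's
    code).  move(p, j) returns (r, k, D), with D in {"up", "stationary",
    "down"}, describing A's move while scanning X_m in state p at input
    position j."""
    def entry(q, i):
        count = 0
        p, j = q, i
        while count < s * (n + 2):
            count += 1
            r, k, D = move(p, j)
            if D == "up":
                return (r, k)              # stack head reaches the top
            if D == "stationary":
                p, j = r, k
            else:                          # "down": consult T for the rest
                if T.get((r, k)) is None:
                    return None            # stays "undefined"
                p, j = T[(r, k)]
        return None                        # looping: "undefined"
    return {(q, i): entry(q, i) for q in range(s) for i in range(n + 2)}
```

A machine that immediately moves up yields the identity table, and one that loops in place yields the all-undefined table, matching the COUNT cutoff in the figure.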
Lemma 14.6 If L is accepted by an NENSA, then L is in NSPACE(n²).

Proof The proof is similar to that of the previous lemma, save that n² space is needed to store the transition matrix, and the simulation must be nondeterministic.
Lemma 14.7 If L is accepted by a DSA, then L is accepted by an n log n-APDA.

Proof The proof is again similar to that of Lemma 14.5. The APDA uses its stack (which it may not enter in read-only mode, of course) to hold the stack of the DSA. Between each DSA stack symbol the APDA stores a transition table. The transition table above a particular stack symbol corresponds to the entire stack, up to and including that symbol. The topmost stack symbol and the table for the stack below it are placed on the storage tape. When the DSA pushes a symbol, the APDA pushes the table that is on its storage tape along with the old top stack symbol onto its own stack, and computes the new table as in Lemma 14.5. When the DSA pops a symbol, the APDA discards the top stack symbol and then moves the top table to its storage tape.

Lemma 14.8 If L is accepted by an NSA, then L is accepted by an n²-APDA.

Proof The proof is a combination of the ideas introduced in Lemmas 14.6 and 14.7. Note that by Theorem 14.1 the APDA may be made deterministic.
We now turn to the simulation of space-bounded devices by stack automata. The key idea here is that the SA can use its input of length n to count n symbols or "blocks" of symbols down its stack. A sequence of ID's representing a computation of a space-bounded device is constructed on the stack by successively copying the top ID onto the stack, making changes represented by one move of the space-bounded device. The ability to count down n symbols or "blocks" of symbols allows the SA to copy the current ID onto the top, symbol by symbol.
As a simple introduction consider the simulation of a deterministic linear bounded automaton M by an NEDSA A. Given input w = a_1 a_2 ⋯ a_n, A pushes onto its stack

#[q_0 a_1] a_2 ⋯ a_n #,

where q_0 is the start state and # is a special symbol separating ID's. The state is combined with the symbol scanned, so an ID is always exactly n symbols long. Suppose A has constructed a stack that is a sequence of ID's, including the first i symbols of the next ID:

# X_1 X_2 ⋯ X_n # Y_1 Y_2 ⋯ Y_i.

(Actually one or two of the X's may differ from the corresponding symbols in the ID below, due to the move made by M.) Starting at the left end of the input, A repeatedly moves one position right on the input and one position down the stack, until the right endmarker is reached on the input. At this point A's stack head will be n + 1 symbols from the top of the stack, scanning X_{i+1} of the last complete ID. A looks one symbol above and below X_{i+1} to see if X_{i+1} changes in the next ID being constructed, because of the move made by M. A then moves to the top of the stack and pushes either X_{i+1} or the symbol replacing X_{i+1} in the next ID due to the move of M. A accepts if and only if M enters an accepting state.

Actually a stack automaton can simulate devices with ID's of length greater than n by more clever use of the input. In particular, a DSA can manipulate ID's of length n log n, and an NSA can manipulate ID's of length n². The nondeterministic case is easier, so we present it first.

Lemma 14.9 If L is in NSPACE(n²), then L is accepted by an NENSA.

Proof Since n² is greater than n, we may assume L is accepted by a one-tape rather than an off-line TM. An ID of length n² is represented by listing the tape symbols, combining the state with the symbol scanned by the tape head. A marker * is inserted after every n symbols. The n symbols between *'s make up a block. Successive ID's are placed on the stack as in the description of the LBA above.
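The copying discipline can be seen concretely in a few lines of code. The sketch below is ours, not the book's: with i symbols of the new ID already pushed, the symbol X_{i+1} to copy sits exactly n + 1 positions below the top of the stack, which is why one left-to-right sweep of the input locates it.

```python
def push_next_id(stack, n, move):
    """Copy the topmost complete n-symbol ID (a sketch of the NEDSA's
    behavior).  `stack` is a Python list ending with an ID followed by
    '#'; move(left, cur, right) returns the symbol replacing `cur` in the
    next ID (or `cur` itself if the device's move does not affect it)."""
    for _ in range(n):
        cur = stack[-(n + 1)]          # X_{i+1}: n+1 positions below the top
        left = stack[-(n + 2)]         # X_i (or '#')
        right = stack[-n]              # X_{i+2} (or '#')
        stack.append(move(left, cur, right))
    stack.append("#")                  # separator ending the new ID

# Toy "LBA" move for illustration: the head marker '>' shifts right one cell.
def shift_right(left, cur, right):
    if cur == ">":
        return "."                     # head leaves this cell
    if left == ">":
        return ">"                     # head arrives here
    return cur

s = list("#>..#")
push_next_id(s, 3, shift_right)
assert "".join(s) == "#>..#.>.#"
```

Only the symbol under the head and its neighbor change between ID's, which is exactly the "one or two of the X's may differ" observation in the text.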
Suppose j blocks and i symbols of the (j + 1)st block have been copied. The input tape is used to measure n *'s down the stack to the (j + 1)st block of the previous ID. A position k, 1 ≤ k ≤ n, in the (j + 1)st block is guessed. Checking one symbol above and below determines if the symbol is affected by a move of the TM. If so, a move is guessed, provided a move for this ID has not been guessed previously; otherwise the symbol is recorded in the state of the SA. The input tape is then used to record k by alternately moving the input head one symbol left (starting at the right end) and the stack head one symbol down until a * is encountered. Next the stack head moves to the top of the stack and compares k with the number of symbols of the (j + 1)st block already copied. If k ≠ i + 1, this sequence of choices "dies." If k = i + 1, then the next symbol of the new ID is placed on top of the stack. The input is then used to determine if i + 1 = n. If so a * is printed, and then it is checked whether j + 1 = n. In the case j + 1 = n, a # is placed on the stack marking the end of an ID. Acceptance occurs if the symbol copied includes a final state. Otherwise the next symbol is copied.

A small but important point is that once a move is guessed in copying an ID, the guess cannot be changed on copying a subsequent symbol in that ID. Otherwise an invalid successor ID may be constructed.
Theorem 14.2 The family of languages accepted by nondeterministic, nonerasing stack automata is exactly NSPACE(n²).

Proof Immediate from Lemmas 14.6 and 14.9.
Theorem 14.3 The family of languages accepted by nondeterministic stack automata is exactly ⋃_{c>0} DTIME(c^{n²}).

Proof By Theorem 14.1, L is in ⋃_{c>0} DTIME(c^{n²}) if and only if L is accepted by a deterministic n²-APDA. By Lemma 14.8, if L is accepted by an NSA, then L is accepted by an n²-APDA. Thus it suffices to show that a deterministic n²-APDA A can be simulated by an NSA S. We assume that the input of A is kept on the storage tape of A rather than on a read-only input, since n² exceeds n. The stack of S will hold the stack of A as well as a sequence of ID's representing the storage tape of A.

Suppose S has the current contents of A's storage tape on top of its stack, and A pushes a symbol. S guesses the tape contents of A when that symbol is popped and places its guess on top of the stack. Then S pushes the symbol pushed by A and creates the new current tape contents of A from the old, as in Lemma 14.9. The guessed ID intervening is ignored while running up and down the stack; its symbols can be chosen from a separate alphabet, so S can skip over it.

If A pops a symbol, S checks that the guessed ID below that symbol is correct; that is, the guessed ID is the storage tape of A after the pop move. The current ID of A held on top of S's stack is popped one symbol at a time, and each symbol popped is compared with the corresponding symbol of the guessed ID by a method similar to that of Lemma 14.9. If the guess is correct, the guessed ID becomes the current storage tape content of A, and the simulation of A proceeds; if not, this sequence of choices by S "dies." S accepts if and only if A empties its stack.
Corollary L is accepted by an NSA if and only if L is accepted by an n²-APDA.

Proof The "only if" portion was established in Lemma 14.8. The "if" follows immediately from Theorems 14.1 and 14.3.

In the deterministic case the function n² is replaced by n log n in the analogs of Theorems 14.2 and 14.3. The reason for this is that in the construction of Lemma 14.9 the NSA made an essential use of its nondeterminism in copying ID's of length n². A DSA is able only to copy ID's of length n log n.

Lemma 14.10 If L is in DSPACE(n log n), then L is accepted by an NEDSA.

Proof Let L be accepted by some one-tape TM M that uses exactly n log n cells. Let t be the number of symbols of the form X or [qX], where X is a tape symbol and q a state. These symbols are identified with the digits 1, 2, ..., t. Strings of ⌊log_{t+1} n⌋ such symbols are encoded as blocks of between 0 and (n - 1) 0's, a block of k 0's representing the string whose value is k in base t + 1. There is an integer c, depending only on M, such that an ID of M may be represented by cn blocks of 0's, each block coding ⌊log_{t+1} n⌋ symbols, provided n > t.

Design a stack automaton S to construct a sequence of ID's of M, each ID being a sequence of cn blocks of between 0 and (n - 1) 0's separated by markers, *. Blocks are copied to the top of the stack by using the input to count cn *'s down the stack, measuring the length of the block to be copied on the input, moving to the top of stack, and pushing an equal number of 0's onto the stack. Before a new block is placed on the stack, it is necessary to determine which, if any, symbols change. To do so, decode 0^k by repeatedly dividing by t + 1, the successive remainders being the successive symbols of the ID. The division is accomplished by measuring k on the input and then moving the input back to the endmarker, placing an X on the stack for every t + 1 positions the input head moves. The X's are not part of an ID. The finite control computes k mod (t + 1), and the resulting digit is placed above the X's. The block of X's is then measured on the input and the process repeated until the block of X's has length zero. The digits written on the stack between blocks of X's are the desired block of the ID. S checks whether the head of M is scanning a symbol in the block and also notes if the head moves into an adjacent block. The blocks are re-encoded into strings of 0 to (n - 1) 0's, making the necessary changes to reflect the move of M. The process of re-encoding is the reverse of that just described. Note that since S is nonerasing, it never gets rid of the X's or digits on its stack; they are simply ignored in subsequent computation. Also, before copying a block, S must decode the block above, to see whether the head of M moves left into the present block. S initializes its stack by coding its own input as an ID of M. The details of this process are omitted. S accepts if it discovers that M enters an accepting state.
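The block encoding used in the proof is ordinary base-(t + 1) positional notation, with the first symbol recovered as the first remainder. A quick sketch (ours, with symbols as the digits 1, ..., t):

```python
def encode(symbols, t):
    """Read a string of symbols (each a digit 1..t) as a single count k
    in base t + 1; the block 0^k stands for the string."""
    k = 0
    for s in reversed(symbols):
        k = k * (t + 1) + s
    return k

def decode(k, t):
    """Recover the symbols by repeated division by t + 1; the successive
    remainders are the successive symbols of the ID."""
    out = []
    while k:
        out.append(k % (t + 1))
        k //= t + 1
    return out
```

Since the symbols are the nonzero digits 1 through t, no remainder is ever 0 before the count is exhausted, so the round trip is exact.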
Theorem 14.4 L is accepted by a deterministic nonerasing stack automaton if and only if L is in DSPACE(n log n).

Proof From Lemmas 14.5 and 14.10.

Theorem 14.5 L is accepted by a deterministic stack automaton if and only if L is in ⋃_{c>0} DTIME(n^{cn}).

Proof Note that n^{cn} = 2^{cn log n}, so ⋃_{c>0} DTIME(n^{cn}) = ⋃_{c>0} DTIME(c^{n log n}). By Theorem 14.1, L is in ⋃_{c>0} DTIME(n^{cn}) if and only if L is accepted by an n log n-APDA. By Lemma 14.7, if L is accepted by a DSA, then L is accepted by a deterministic n log n-APDA. Thus it suffices to show that if L is accepted by a deterministic n log n-APDA A, then L is accepted by a DSA S.
Again we assume that A's input tape is combined with its storage tape. The proof parallels Theorem 14.3, using the techniques of Lemma 14.10 to represent storage tapes of A and simulate moves of A. However, when A pushes a symbol X onto its stack, S, being deterministic, cannot guess the storage tape contents of A when A eventually pops that X. Instead S cycles through all possible ID's systematically. If it has made the wrong choice, it generates the next possible ID and restarts the simulation of A from the time X was pushed by A. The fact that A empties its stack to accept assures that if A accepts, S will eventually get a chance to generate the correct choice.
Corollary L is accepted by a DSA if and only if L is accepted by an n log n-APDA.

Proof The "only if" portion was established in Lemma 14.7. The "if" follows immediately from Theorems 14.1 and 14.5.
One-way stack automata are not powerful enough to simulate tape-bounded devices. However, there is one important containment relation, which we state without proof.

Theorem 14.6 If L is accepted by a 1NSA, then L is in DSPACE(n).

14.3 INDEXED LANGUAGES

Of the many generalizations of context-free grammars that have been proposed, a class called "indexed" appears the most natural, in that it arises in a wide variety of contexts. We give a grammar definition here. Other definitions of the indexed languages are cited in the bibliographic notes.

An indexed grammar is a 5-tuple (V, T, I, P, S), where V is the set of variables, T the set of terminals, I the set of indices, S in V is the start symbol, and P is a finite set of productions of the forms

1) A → α,    2) A → Bf,    or    3) Af → α,

where A and B are in V, f is in I, and α is in (V ∪ T)*.

Derivations in an indexed grammar are similar to those in a CFG except that variables may be followed by strings of indices. (Terminals may not be followed by indices.) When a production such as A → BC is applied, the string of indices for A is attached to both B and C. This feature enables many parts of a sentential form to be related to each other by sharing a common index string.
Formally, we define the relation ⇒ on sentential forms, which are strings in (VI* ∪ T)*, as follows. Let β and γ be in (VI* ∪ T)*, δ be in I*, and each X_i be in V ∪ T.

1) If A → X_1 X_2 ⋯ X_k is a production of type (1), then

βAδγ ⇒ βX_1 δ_1 X_2 δ_2 ⋯ X_k δ_k γ,

where δ_i = δ if X_i is in V and δ_i = ε if X_i is in T. When a production of type (1) is applied, the string of indices δ distributes over all the variables on the right side.

2) If A → Bf is a production of type (2), then βAδγ ⇒ βBfδγ. Here f becomes the first index on the string following variable B, which replaces A.

3) If Af → X_1 X_2 ⋯ X_k is a production of type (3), then

βAfδγ ⇒ βX_1 δ_1 X_2 δ_2 ⋯ X_k δ_k γ,

where δ_i = δ if X_i is in V and δ_i = ε if X_i is in T. The first index on the list for A is consumed, and the remaining indices distribute over variables as in (1).
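Rule (1) is easy to state operationally. The following sketch is ours (the representation, a sentential form as a list of (symbol, index-string) pairs, is a choice made for illustration): an applied production distributes the index string over the variables of the right side, and terminals get the empty index string.

```python
def apply_type1(form, pos, rhs, variables):
    """Apply a type-(1) production A -> X_1...X_k at position `pos` of a
    sentential form.  The index string carried by A distributes over the
    variables among the X's; terminals receive no indices."""
    _, delta = form[pos]
    expansion = [(x, delta if x in variables else "") for x in rhs]
    return form[:pos] + expansion + form[pos + 1:]

# A carries the index string "fg"; A -> aBC distributes "fg" to B and C only.
form = [("A", "fg")]
result = apply_type1(form, 0, "aBC", {"A", "B", "C"})
assert result == [("a", ""), ("B", "fg"), ("C", "fg")]
```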
We let ⇒* be the reflexive and transitive closure of ⇒ as usual, and define L(G) to be {w | S ⇒* w and w is in T*}.

Example 14.2 Let G = ({S, T, A, B, C}, {a, b, c}, {f, g}, P, S), where P consists of

S → Tg,      T → Tf,      T → ABC,
Af → aA,     Bf → bB,     Cf → cC,
Ag → a,      Bg → b,      Cg → c.

An example derivation in this indexed grammar is

S ⇒ Tg ⇒ Tfg ⇒ AfgBfgCfg ⇒ aAgBfgCfg ⇒ aaBfgCfg ⇒ aabBgCfg ⇒ aabbCfg ⇒ aabbcCg ⇒ aabbcc.

In general, S ⇒* Tf^i g ⇒ Af^i g Bf^i g Cf^i g ⇒* a^{i+1} b^{i+1} c^{i+1}.
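The derivation of Example 14.2 can be replayed mechanically. The sketch below is our illustration: each variable carries its index string, the f's are consumed one at a time, and the g finishes the variable off.

```python
def derive(i):
    """Replay Example 14.2: S => Tg => Tf^i g => Af^i g Bf^i g Cf^i g,
    then each variable consumes its f's and finishes on g."""
    idx = "f" * i + "g"              # index string after i uses of T -> Tf
    out = []
    for var in "ABC":
        piece, rest = "", idx
        while rest[0] == "f":        # Af -> aA, Bf -> bB, Cf -> cC
            piece += var.lower()
            rest = rest[1:]
        piece += var.lower()         # Ag -> a, Bg -> b, Cg -> c
        out.append(piece)
    return "".join(out)

assert derive(2) == "aaabbbccc"      # a^{i+1} b^{i+1} c^{i+1} with i = 2
```

Because all three variables share the same index string, the three terminal runs necessarily have equal length, which is the whole point of the shared-index mechanism.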
As the only freedom in derivations of G consists of trivial variations in order of replacement and the choice of how many times to apply T → Tf, it should be clear that L(G) = {a^n b^n c^n | n ≥ 1}. This language is not context free, of course.

We state without proof two major results about indexed languages.

Theorem 14.7 (a) If L is accepted by a one-way nondeterministic stack automaton, then L is an indexed language. (b) If L is an indexed language, then L is a context-sensitive language.

In fact, (a) can be strengthened by defining a generalization of an SA, called a "nested stack automaton," whose one-way nondeterministic variety exactly characterizes the indexed languages. The nested SA has the capability, when the stack head is inside its stack in read-only mode, to create a new stack. However, this stack must be destroyed before the stack head can move up in the original stack. The process of creating new stacks is recursive and allows the creation of new stacks to an arbitrary depth.
14.4 DEVELOPMENTAL SYSTEMS

The application of grammars to the study of growth in cellular organisms introduced new grammar families called L-systems. These grammar families differ from the Chomsky grammars in that

1) no distinction between terminals and nonterminals is made, and
2) at each step in a derivation, a production is applied to each symbol of a sentential form, rather than to just one symbol or a short substring.

The modeling of organisms by L-systems allows the testing of hypotheses concerning the mechanisms behind certain observable biological phenomena. Here we content ourselves with defining only the most basic family of these grammars, called 0L-systems. (The 0 stands for zero symbols of context; the L acknowledges Aristid Lindenmayer, who first used these grammars to study growth in organisms.)

A 0L-grammar is a triple G = (Σ, P, α), where Σ is a finite alphabet called the vocabulary, α, a string in Σ⁺, is called the start string, and P is a finite set of productions of the form a → β, where a is in Σ and β is in Σ*. The relation ⇒ is defined by

a_1 a_2 ⋯ a_n ⇒ β_1 β_2 ⋯ β_n

if a_i → β_i is in P for 1 ≤ i ≤ n. Note that a → a might be a production, permitting us to avoid substituting for a symbol. Otherwise, a substitution must be made for each symbol. The substitution made for different occurrences of the same symbol need not be the same. The relation ⇒* is the reflexive, transitive closure of ⇒, and L(G) is defined to be {β | α ⇒* β}.
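One derivation step of a 0L-grammar rewrites every symbol simultaneously. The sketch below is ours and handles only the special case in which each symbol has a single production (so the derivation is unique); iterating it on the grammar a → b, b → ab of Example 14.3 below reproduces the Fibonacci word lengths noted there.

```python
def ol_step(word, prods):
    """One step of a 0L derivation in the deterministic special case:
    every symbol is rewritten at once by its unique production."""
    return "".join(prods[a] for a in word)

prods = {"a": "b", "b": "ab"}
w, lengths = "a", []
for _ in range(8):
    lengths.append(len(w))
    w = ol_step(w, prods)
assert lengths == [1, 1, 2, 3, 5, 8, 13, 21]   # Fibonacci numbers
```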
Example 14.3 Let G = ({a, b}, P, a), where P consists of a → b and b → ab. In this case, there is only one production for each symbol, so there is really only one (infinite length) derivation, and every word in the language appears in that derivation. The derivation is

a ⇒ b ⇒ ab ⇒ bab ⇒ abbab ⇒ bababbab ⇒ ⋯.

Note that the lengths of the words in L(G) are exactly the Fibonacci numbers defined by f_1 = f_2 = 1 and f_i = f_{i-1} + f_{i-2} for i ≥ 3. One can prove by induction on i ≥ 3 that the ith word in the derivation has f_{i-2} a's and f_{i-1} b's, a total of f_i symbols.

Example 14.4 The language {a, aa} is not a 0L-language. Suppose L(G) = {a, aa} for some G = ({a}, P, α). Then α must be a or aa. Now all productions are of the form a → a^i for some i ≥ 0. Suppose α = a. Surely there cannot be a production a → a^i for i ≥ 3. There must be a production a → aa, else aa could never be generated. But then a ⇒ aa ⇒ aaaa, a contradiction. Suppose next that α = aa. There must be a production a → ε, else all strings in L(G) are of length two or more. But then aa ⇒ ε, so L(G) ≠ {a, aa} again.
Proof
>
i
0.
basic result about (^languages
Theorem
Z, {/
some Then
14.8
Let ^},
P2
If
G = x
,
5),
L (Z,
is
is
the following.
a OZ^language, then
Pu
cc)
L
is
an indexed language.
be a C\L-grammar. Define indexed
where
K=
{5,
T}
u
{/l fl |fl is in
£},
grammar G 2 = (K
and P_2 contains

S → Tg,
T → Tf,
T → A_{a_1} A_{a_2} ⋯ A_{a_m},    where α = a_1 a_2 ⋯ a_m,
A_a f → A_{b_1} A_{b_2} ⋯ A_{b_k}    for each production a → b_1 b_2 ⋯ b_k in P_1, and
A_a g → a    for each a in Σ.

Informally, the string of f's counts the number of steps in a derivation of G_1, and the index g marks the end of an index string, allowing a variable to be replaced by the terminal it represents. An easy induction on the length of a derivation shows that

S ⇒* Tf^i g ⇒ A_{a_1} f^i g ⋯ A_{a_m} f^i g ⇒* A_{b_1} g A_{b_2} g ⋯ A_{b_k} g

if and only if α ⇒ b_1 b_2 ⋯ b_k by a derivation of i steps in G_1. As A_{b_1} g A_{b_2} g ⋯ A_{b_k} g ⇒* b_1 b_2 ⋯ b_k, it follows that L(G_2) = L(G_1).
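The construction of Theorem 14.8 can be exercised on the grammar of Example 14.3. The sketch below is ours: it simulates G_2 by consuming one f from every variable per simulated 0L step (again restricted, for simplicity, to one production per symbol), and the result agrees with direct iteration of G_1.

```python
def indexed_from_ol(prods, alpha, i):
    """Simulate G_2 of Theorem 14.8: each variable A_a carries the index
    string f^i g; a simulated 0L step replaces A_a by A_b1...A_bk and
    consumes one f; A_a g finally yields the terminal a."""
    form = [(a, "f" * i + "g") for a in alpha]   # after S => Tf^i g => ...
    while form and form[0][1] != "g":
        form = [(b, idx[1:]) for a, idx in form for b in prods[a]]
    return "".join(a for a, _ in form)           # apply A_a g -> a

prods = {"a": "b", "b": "ab"}
direct = "a"
for i in range(6):                               # compare against G_1 directly
    assert indexed_from_ol(prods, "a", i) == direct
    direct = "".join(prods[c] for c in direct)
```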
14.5 SUMMARY

Figure 14.7 shows the various equivalences and containments proved or stated in this chapter, plus some others that are immediate from definitions. Containments are indicated by upward edges.
EXERCISES

14.1 a) Design a one-way DSA to recognize the language {0^n 1^{n²} | n ≥ 1}.
b) Design a one-way NSA to recognize the language {ww | w is in (0 + 1)*}.

*14.2 Design a two-way DSA to accept the set of binary strings whose value, treated as an integer, is a power of 3.

**14.3 Since every CFL can be recognized in polynomial time by the CYK algorithm, the corollary to Theorem 14.1 implies that every CFL is recognized by some deterministic log n-APDA. Give a direct construction of such an APDA from a CFG.

14.4 Show that the family of 1NSA languages and the family of 1NENSA languages are full AFL's.

14.5 Show that the families of 1DSA languages and 1NEDSA languages are closed under:
a) intersection with a regular set,
b) inverse GSM mappings,
**c) complementation,
**d) quotient with a regular set.

14.6 Give indexed grammars generating the following languages.
Sa) {0^n | n is a perfect square}
b) {0^n | n is a power of 2}
c) {0^n | n is not a prime}
d) {ww | w is in (0 + 1)*}
Fig. 14.7 Containments among classes of languages.
14.7 Give 0L-grammars generating the following languages.
a) {a^n | n is a power of 2}
b) {wcw^R | w is in (0 + 1)*}

*S14.8 Give a 0L-grammar with the property that every string generated is of length a perfect square, and furthermore for every perfect square there is at least one string of that length generated.

*14.9 Of the eight subsets of {ε, a, aa}, how many are 0L-languages?

**14.10 Show that the family of 0L-languages is not closed under any of the AFL operations.

**14.11 Show that it is decidable whether the language generated by an indexed grammar is empty.

*14.12 Show that Greibach's theorem (Theorem 8.14) applies to the 1NEDSA languages, and that "= Σ*" is undecidable for this class.

**14.13 Show that it is undecidable whether two 0L-languages are equivalent.
Solutions to Selected Exercises

14.6 a) We make use of the fact that the nth perfect square is the sum of the first n odd integers. The indexed grammar with productions

S → Ag,      A → Af,      A → B,      B → CD,
Df → B,      Dg → ε,      Cf → 00C,     Cg → 0

generates {0^n | n is a perfect square}. The derivations are trivial variations of the following derivation.

S ⇒ Ag ⇒* Af^{n-1}g ⇒ Bf^{n-1}g ⇒ Cf^{n-1}g Df^{n-1}g ⇒ Cf^{n-1}g Bf^{n-2}g
  ⇒ Cf^{n-1}g Cf^{n-2}g Df^{n-2}g ⇒ ⋯ ⇒ Cf^{n-1}g Cf^{n-2}g ⋯ Cfg Cg Dg
  ⇒ Cf^{n-1}g Cf^{n-2}g ⋯ Cfg Cg ⇒* 0^{2n-1} 0^{2n-3} ⋯ 0^3 0^1 = 0^{n²}.

14.8 We again make use of the fact that the nth perfect square is the sum of the first n odd integers. Consider the 0L-grammar ({a, b, c}, {a → abbc, b → bc, c → c}, a). A simple induction shows that the nth string generated has one a, 2(n - 1) b's, and (n - 1)² c's. Thus the length of the nth string is 1 + 2(n - 1) + (n - 1)² = n².
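Both solutions rest on the identity 1 + 3 + ⋯ + (2n - 1) = n². The sketch below (ours) replays each construction and checks the resulting lengths.

```python
# Solution 14.6(a): C f^j g derives 0^(2j+1), and D f^j g respawns B f^(j-1) g,
# so B f^(n-1) g derives one block of 0's per odd number 2n-1, 2n-3, ..., 1.
def derive_B(j):
    block = "00" * j + "0"                       # C f^j g =>* 0^(2j) 0
    return block + (derive_B(j - 1) if j > 0 else "")   # Df -> B; Dg -> erase

assert all(len(derive_B(n - 1)) == n * n for n in range(1, 8))

# Solution 14.8: iterate the 0L-grammar and verify the symbol counts.
prods = {"a": "abbc", "b": "bc", "c": "c"}
w = "a"
for n in range(1, 8):
    assert (w.count("a"), w.count("b"), w.count("c")) == (1, 2 * (n - 1), (n - 1) ** 2)
    assert len(w) == n * n
    w = "".join(prods[ch] for ch in w)
```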
BIBLIOGRAPHIC NOTES

The auxiliary pushdown automaton and Theorem 14.1 are from Cook [1971a]. Earlier, Mager [1969] had considered "writing pushdown acceptors," which are n-APDA's. Stack automata were first considered by Ginsburg, Greibach, and Harrison [1967a, b]. Theorems 14.2 and 14.4, relating nonerasing stack automata to space complexity classes, are from Hopcroft and Ullman [1967a], although the fact that the CSL's are contained in the NEDSA languages was known from Ginsburg, Greibach, and Harrison [1967a]. Theorems 14.3 and 14.5, relating stack languages to APDA's and time complexity classes, are by Cook [1971a].

The basic closure and decision properties of one-way stack languages were treated in Ginsburg, Greibach, and Harrison [1967b]. Exercise 14.5(d), the closure of 1DSA languages under quotient with a regular set, is by Hopcroft and Ullman [1968b]. Theorem 14.6, the containment of the 1NSA languages in DSPACE(n), is by Hopcroft and Ullman [1968c]. Ogden [1969] gives a "pumping lemma" for one-way stack languages. Beeri [1975] shows that two-way SA's are equivalent to two-way nested stack automata.

Indexed grammars were first studied by Aho [1968]. Theorem 14.7(b), the containment of the indexed languages within the CSL's, is from there, as is Exercise 14.11, the decidability of emptiness. A variety of other characterizations of the indexed languages are known. Aho [1969] discusses one-way nested stack automata, an automaton characterization. Fischer [1968] discusses macro grammars, Greibach [1970] provides another automaton characterization (a device with a stack of stacks), and Maibaum [1974] presents an algebraic characterization. These alternative formulations lend credence to the idea that the indexed languages are a "natural" class. Hayashi [1975] gives a "pumping lemma" for indexed languages.

L-systems originated with Lindenmayer [1968], and the 0L-systems, on which we have concentrated, were considered by Lindenmayer [1971]. Exercise 14.10, on nonclosure properties of these languages, is from Herman [1974]. Exercise 14.13, the undecidability of equivalence of 0L-languages, is implied by a stronger result of Blattner [1973], that it is undecidable whether the sets of sentential forms generated by two CFG's are the same. Much has been written on the subject, and the interested reader is referred to Salomaa [1973] and Herman and Rozenberg [1975].

We have but touched on some of the multitude of species of automata and grammars that have been studied. Rosenkrantz [1969] is representative of another early step in this direction, and Salomaa [1973] covers a variety of classes not touched upon here.
BIBLIOGRAPHY
Aanderaa,
S.
O. [1974].
"On
fc-tape versus (k
of Computation (R. M. Karp,
ed.).
-
l)-tape real time computation," Complexity
Proceedings of
SIAM-AMS Symposium
in
Applied
Mathematics.
and K. Manders [1976]. "Diophantine complexity," Proc. Seventeenth Annual on Foundations of Computer Science, pp. 81-88. Adleman, L., and K. Manders [1977]. "Reducibility, randomness and intractability," Proc. Ninth Annual ACM Symposium on the Theory of Computing, pp. 151-163. Aho, A. V. [1968]. 'indexed grammars an extension of context-free grammars," J. ACM
Adleman,
L.,
IEEE Symposium
—
15: 4,
647-671.
Aho, A. V. [1969]. "Nested stack automata," J. ACM 16: 3, 383-406. V., and M. J. Corasick [1975]. "Efficient string matching: an aid to bibliographic
Aho, A.
ACM
search,"
Comm.
Aho, A.
V., J. E.
down automaton Aho, A.
18: 6,
333-340.
Hopcroft, and Hopcroft, and
V., J. E.
J.
D. Ullman [1968]. "Time and tape complexity of push-
languages," Information and Control 13: J.
3,
186-206.
D. Ullman [1974]. The Design and Analysis of Computer
Algorithms, Addison- Wesley, Reading, Mass.
Aho, A. V., and S. C. Johnson [1974]. "LR parsing," Computing Surveys 6: 2, 99-124.

Aho, A. V., and J. D. Ullman [1970]. "A characterization of two-way deterministic classes of languages," J. Computer and Systems Sciences 4: 6, 523-538.

Aho, A. V., and J. D. Ullman [1972]. The Theory of Parsing, Translation and Compiling, Vol. I: Parsing, Prentice Hall, Englewood Cliffs, N.J.

Aho, A. V., and J. D. Ullman [1973]. The Theory of Parsing, Translation and Compiling, Vol. II: Compiling, Prentice Hall, Englewood Cliffs, N.J.

Aho, A. V., and J. D. Ullman [1977]. Principles of Compiler Design, Addison-Wesley, Reading, Mass.

Arbib, M. A. [1970]. Theories of Abstract Automata, Prentice Hall, Englewood Cliffs, N.J.
Arden, D. N. [1960]. "Delayed logic and finite state machines," Theory of Computing Machine Design, pp. 1-35, Univ. of Michigan Press, Ann Arbor, Mich.

Axt, P. [1959]. "On a subrecursive hierarchy and primitive recursive degrees," Trans. AMS 92, 85-105.

Backus, J. W. [1959]. "The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM conference," Proc. Intl. Conf. on Information Processing, pp. 125-132, UNESCO.
Baker, B. S., and R. V. Book [1974]. "Reversal bounded multipushdown machines," J. Computer and Systems Sciences 8: 3, 315-332.

Baker, T., J. Gill, and R. Solovay [1975]. "Relativizations of the P =? NP question," SIAM J. Computing 4: 4, 431-442.

Bar-Hillel, Y., M. Perles, and E. Shamir [1961]. "On formal properties of simple phrase structure grammars," Z. Phonetik. Sprachwiss. Kommunikationsforsch. 14, 143-172.

Bauer, M., D. Brand, M. J. Fischer, A. R. Meyer, and M. S. Paterson [1973]. "A note on disjunctive form tautologies," SIGACT News 5: 2, 17-20.

Beeri, C. [1975]. "Two-way nested stack automata are equivalent to two-way stack automata," J. Computer and Systems Sciences 10: 3, 317-339.

Beeri, C. [1976]. "An improvement on Valiant's decision procedure for equivalence of deterministic finite-turn pushdown automata," Theoretical Computer Science 3: 3, 305-320.

Berman, L. [1977]. "Precise bounds for Presburger arithmetic and the reals with addition," Proc. Eighteenth Annual IEEE Symposium on Foundations of Computer Science, pp. 95-99.

Berman, L., and J. Hartmanis [1977]. "On isomorphisms and density of NP and other complete sets," SIAM J. Computing 6: 2, 305-322.

Berman, P. [1978]. "Relationship between density and deterministic complexity of NP-complete languages," Fifth International Symposium on Automata, Languages, and Programming, Udine, Italy.

Bird, M. [1973]. "The equivalence problem for deterministic two-tape automata," J. Computer and Systems Sciences 7: 2, 218-236.

Blattner, M. [1973]. "The unsolvability of the equality problem for sentential forms of context-free grammars," J. Computer and Systems Sciences 7: 5, 463-468.
Blum, M. [1967]. "A machine-independent theory of the complexity of recursive functions," J. ACM 14: 2, 322-336.

Blum, M. [1971]. "On effective procedures for speeding up algorithms," J. ACM 18: 2, 290-305.

Boasson, L. [1973]. "Two iteration theorems for some families of languages," J. Computer and Systems Sciences 7: 6, 583-596.

Book, R. V. [1972]. "On languages accepted in polynomial time," SIAM J. Computing 1: 4, 281-287.

Book, R. V. [1974]. "Comparing complexity classes," J. Computer and Systems Sciences 9: 2, 213-229.

Book, R. V. [1976]. "Translational lemmas, polynomial time, and (log n)^j space," Theoretical Computer Science 1: 3, 215-226.

Book, R. V., and S. A. Greibach [1970]. "Quasi-realtime languages," Math. Systems Theory 4: 2, 97-111.

Book, R. V., S. A. Greibach, and B. Wegbreit [1970]. "Time- and tape-bounded Turing acceptors and AFL's," J. Computer and Systems Sciences 4: 6, 606-621.
Borodin, A. [1972]. "Computational complexity and the existence of complexity gaps," J. ACM 19: 1, 158-174.

Borosh, I., and L. B. Treybig [1976]. "Bounds on positive integral solutions of linear Diophantine equations," Proc. AMS 55, 299-304.

Brainerd, W. S., and L. H. Landweber [1974]. Theory of Computation, John Wiley and Sons, New York.

Bruno, J. L., and R. Sethi [1976]. "Code generation for a one-register machine," J. ACM 23: 3, 502-510.

Bruss, A. R., and A. R. Meyer [1978]. "On time-space classes and their relation to the theory of real addition," Proc. Tenth Annual ACM Symposium on the Theory of Computing, pp. 233-239.

Brzozowski, J. A. [1962]. "A survey of regular expressions and their applications," IEEE Trans. on Electronic Computers 11: 3, 324-335.

Brzozowski, J. A. [1964]. "Derivatives of regular expressions," J. ACM 11: 4, 481-494.

Bullen, R. H., Jr., and J. K. Millen [1972]. "Microtext — the design of a microprogrammed finite-state search machine for full text retrieval," Proc. 1972 Fall Joint Computer Conference, pp. 479-488, AFIPS Press, Montvale, N.J.

Cantor, D. C. [1962]. "On the ambiguity problem of Backus systems," J. ACM 9: 4, 477-479.
Cardoza, E., R. J. Lipton, and A. R. Meyer [1976]. "Exponential space complete problems for Petri nets and commutative semi-groups: preliminary report," Proc. Eighth Annual ACM Symposium on the Theory of Computing, pp. 50-54.

Chandler, W. J. [1969]. "Abstract families of deterministic languages," Proc. First Annual ACM Symposium on the Theory of Computing, pp. 21-30.

Chandra, A. K., and L. J. Stockmeyer [1976]. "Alternation," Proc. Seventeenth Annual IEEE Symposium on Foundations of Computer Science, pp. 98-108.

Chomsky, N. [1956]. "Three models for the description of language," IRE Trans. on Information Theory 2: 3, 113-124.

Chomsky, N. [1959]. "On certain formal properties of grammars," Information and Control 2: 2, 137-167.

Chomsky, N. [1962]. "Context-free grammars and pushdown storage," Quarterly Prog. Rept. No. 65, pp. 187-194, MIT Res. Lab. Elect., Cambridge, Mass.

Chomsky, N. [1963]. "Formal properties of grammars," Handbook of Math. Psych., Vol. 2, pp. 323-418, John Wiley and Sons, New York.

Chomsky, N., and G. A. Miller [1958]. "Finite state languages," Information and Control 1: 2, 91-112.

Chomsky, N., and M. P. Schutzenberger [1963]. "The algebraic theory of context-free languages," Computer Programming and Formal Systems, pp. 118-161, North Holland, Amsterdam.

Christofides, N. [1976]. "Worst case analysis of a new heuristic for the traveling salesman problem," Algorithms and Complexity: New Directions and Recent Results (J. Traub, ed.), p. 441, Academic Press, New York.

Church, A. [1936]. "An unsolvable problem of elementary number theory," Amer. J. Math. 58, 345-363.

Church, A. [1941]. "The Calculi of Lambda-Conversion," Annals of Mathematics Studies 6, Princeton Univ. Press, Princeton, N.J.
Cobham, A. [1964]. "The intrinsic computational difficulty of functions," Proc. 1964 Congress for Logic, Mathematics, and Philosophy of Science, pp. 24-30, North Holland, Amsterdam.

Coffman, E. G., Jr. (ed.) [1976]. Computer and Job Shop Scheduling Theory, John Wiley and Sons, New York.

Cole, S. N. [1969]. "Pushdown store machines and real-time computation," Proc. First Annual ACM Symposium on the Theory of Computing, pp. 233-246.

Constable, R. L. [1972]. "The operator gap," J. ACM 19: 1, 175-183.

Conway, J. H. [1971]. Regular Algebra and Finite Machines, Chapman and Hall, London.

Cook, S. A. [1971a]. "Characterizations of pushdown machines in terms of time-bounded computers," J. ACM 18: 1, 4-18.

Cook, S. A. [1971b]. "The complexity of theorem proving procedures," Proc. Third Annual ACM Symposium on the Theory of Computing, pp. 151-158.

Cook, S. A. [1971c]. "Linear time simulation of deterministic two-way pushdown automata," Proc. 1971 IFIP Congress, pp. 75-80, North Holland, Amsterdam.

Cook, S. A. [1973a]. "A hierarchy for nondeterministic time complexity," J. Computer and Systems Sciences 7: 4, 343-353.
Cook, S. A. [1973b]. "An observation on time-storage trade off," Proc. Fifth Annual ACM Symposium on the Theory of Computing, pp. 29-33.

Cook, S. A., and R. A. Reckhow [1973]. "Time bounded random access machines," J. Computer and Systems Sciences 7: 4, 354-375.

Cook, S. A., and R. Sethi [1976]. "Storage requirements for deterministic polynomial time recognizable languages," J. Computer and Systems Sciences 13: 1, 25-37.

Cooper, C. D. [1972]. "Theorem proving in arithmetic without multiplication," Machine Intelligence 7 (Melzer and Mitchie, eds.), pp. 91-99, John Wiley and Sons, New York.

Cremers, A., and S. Ginsburg [1975]. "Context-free grammar forms," J. Computer and Systems Sciences 11: 1, 86-117.

Cudia, D. F. [1970]. "General problems of formal grammars," J. ACM 17: 1, 31-43.

Cudia, D. F., and W. E. Singletary [1968]. "Degrees of unsolvability in formal grammars," J. ACM 15: 4, 680-692.

Davis, M. [1958]. Computability and Unsolvability, McGraw-Hill, New York.

Davis, M. (ed.) [1965]. The Undecidable, Raven Press, New York.

De Remer, F. L. [1969]. "Generating parsers for BNF grammars," Proc. 1969 Spring Joint Computer Conference, pp. 793-799, AFIPS Press, Montvale, N.J.

De Remer, F. L. [1971]. "Simple LR(k) grammars," Comm. ACM 14: 7, 453-460.

Earley, J. [1970]. "An efficient context-free parsing algorithm," Comm. ACM 13: 2, 94-102.

Eilenberg, S., and C. C. Elgot [1970]. Recursiveness, Academic Press, New York.

Even, S., and R. E. Tarjan [1976]. "A combinatorial problem which is complete in polynomial space," J. ACM 23: 4, 710-719.

Evey, J. [1963]. "Application of pushdown store machines," Proc. 1963 Fall Joint Computer Conference, pp. 215-227, AFIPS Press, Montvale, N.J.

Ferrante, J., and C. Rackoff [1975]. "A decision procedure for the first order theory of real addition with order," SIAM J. Computing 4: 1, 69-76.

Fischer, M. J. [1968]. "Grammars with macro-like productions," Proc. Ninth Annual IEEE Symposium on Switching and Automata Theory, pp. 131-142.
Fischer, M. J. [1969]. "Two characterizations of the context-sensitive languages," Proc. Tenth Annual IEEE Symposium on Switching and Automata Theory, pp. 157-165.

Fischer, M. J., and M. O. Rabin [1974]. "Super-exponential complexity of Presburger arithmetic," Complexity of Computation (R. M. Karp, ed.), Proceedings of SIAM-AMS Symposium in Applied Mathematics.

Fischer, P. C. [1963]. "On computability by certain classes of restricted Turing machines," Proc. Fourth Annual IEEE Symp. on Switching Circuit Theory and Logical Design, pp. 23-32.

Fischer, P. C. [1965]. "On formalisms for Turing machines," J. ACM 12: 4, 570-588.

Fischer, P. C. [1966]. "Turing machines with restricted memory access," Information and Control 9: 4, 364-379.

Fischer, P. C., A. R. Meyer, and A. L. Rosenberg [1968]. "Counter machines and counter languages," Math. Systems Theory 2: 3, 265-283.

Fischer, P. C., A. R. Meyer, and A. L. Rosenberg [1972]. "Real-time simulation of multihead tape units," J. ACM 19: 4, 590-607.

Floyd, R. W. [1962a]. "On ambiguity in phrase structure languages," Comm. ACM 5: 10, 526-534.

Floyd, R. W. [1962b]. "On the nonexistence of a phrase structure grammar for ALGOL 60," Comm. ACM 5: 9, 483-484.

Floyd, R. W. [1964]. "New proofs and old theorems in logic and formal linguistics," Computer Associates Inc., Wakefield, Mass.

Floyd, R. W. [1967]. "Nondeterministic algorithms," J. ACM 14: 4, 636-644.

Freedman, A. R., and R. E. Ladner [1975]. "Space bounds for processing contentless inputs," J. Computer and Systems Sciences 11: 1, 118-128.

Friedman, A. [1975]. Logical Design of Digital Systems, Computer Science Press, Potomac, Md.

Friedman, E. P. [1976]. "The inclusion problem for simple languages," Theoretical Computer Science 1: 4, 297-316.

Friedman, E. P. [1977]. "The equivalence problem for deterministic context-free languages and monadic recursion schemes," J. Computer and Systems Sciences 14: 3, 344-359.
Gabriellian, A., and S. Ginsburg [1974]. "Grammar schemata," J. ACM 21: 2, 213-226.

Garey, M. R., R. L. Graham, and D. S. Johnson [1976]. "Some NP-complete geometric problems," Proc. Eighth Annual ACM Symposium on the Theory of Computing, pp. 10-22.

Garey, M. R., and D. S. Johnson [1976]. "The complexity of near-optimal graph coloring," J. ACM 23: 1, 43-49.

Garey, M. R., and D. S. Johnson [1978]. Computers and Intractability: A Guide to the Theory of NP-Completeness, H. Freeman, San Francisco.

Garey, M. R., D. S. Johnson, and L. J. Stockmeyer [1976]. "Some simplified NP-complete problems," Theoretical Computer Science 1: 3, 237-267.

Garey, M. R., D. S. Johnson, and R. E. Tarjan [1976]. "The planar Hamilton circuit problem is NP-complete," SIAM J. Computing 5: 4, 704-714.

Gathen, J., and M. Sieveking [1976]. "A bound on the solutions of linear integer programs," Unpublished notes.

Ginsburg, S. [1962]. "Examples of abstract machines," IEEE Trans. on Electronic Computers 11: 2, 132-135.
Ginsburg, S. [1966]. The Mathematical Theory of Context-Free Languages, McGraw-Hill, New York.

Ginsburg, S. [1975]. Algebraic and Automata-Theoretic Properties of Formal Languages, North Holland, Amsterdam.

Ginsburg, S., and S. A. Greibach [1966a]. "Deterministic context-free languages," Information and Control 9: 6, 620-648.

Ginsburg, S., and S. A. Greibach [1966b]. "Mappings which preserve context-sensitive languages," Information and Control 9: 6, 563-582.

Ginsburg, S., and S. A. Greibach [1969]. "Abstract families of languages," Studies in Abstract Families of Languages, pp. 1-32, Memoir No. 87, American Mathematical Society, Providence, R.I.

Ginsburg, S., and S. A. Greibach [1970]. "Principal AFL," J. Computer and Systems Sciences 4: 4, 308-338.

Ginsburg, S., S. A. Greibach, and M. A. Harrison [1967a]. "Stack automata and compiling," J. ACM 14: 1, 172-201.

Ginsburg, S., S. A. Greibach, and M. A. Harrison [1967b]. "One-way stack automata," J. ACM 14: 2, 389-418.

Ginsburg, S., and J. E. Hopcroft [1970]. "Two-way balloon automata and AFL," J. ACM 17: 1, 3-13.

Ginsburg, S., and H. G. Rice [1962]. "Two families of languages related to ALGOL," J. ACM 9: 3, 350-371.

Ginsburg, S., and G. F. Rose [1963a]. "Some recursively unsolvable problems in ALGOL-like languages," J. ACM 10: 1, 29-47.

Ginsburg, S., and G. F. Rose [1963b]. "Operations which preserve definability in languages," J. ACM 10: 2, 175-195.

Ginsburg, S., and G. F. Rose [1966]. "Preservation of languages by transducers," Information and Control 9: 2, 153-176.

Ginsburg, S., and E. H. Spanier [1963]. "Quotients of context-free languages," J. ACM 10: 4, 487-492.

Ginsburg, S., and E. H. Spanier [1966]. "Finite turn pushdown automata," SIAM J. Control 4: 3, 429-453.

Ginsburg, S., and J. S. Ullian [1966a]. "Ambiguity in context-free languages," J. ACM 13: 1, 62-88.

Ginsburg, S., and J. S. Ullian [1966b]. "Preservation of unambiguity and inherent ambiguity in context-free languages," J. ACM 13: 3, 364-368.
Graham, S. L. [1970]. "Extended precedence languages, bounded right context languages and deterministic languages," Proc. Eleventh Annual IEEE Symposium on Switching and Automata Theory, pp. 175-180.

Graham, S. L., M. A. Harrison, and W. L. Ruzzo [1976]. "On-line context-free language recognition in less than cubic time," Proc. Eighth Annual ACM Symposium on the Theory of Computing, pp. 112-120.

Gray, J. N., M. A. Harrison, and O. Ibarra [1967]. "Two-way pushdown automata," Information and Control 11: 1-2, 30-70.

Greibach, S. A. [1963]. "The undecidability of the ambiguity problem for minimal linear grammars," Information and Control 6: 2, 117-125.

Greibach, S. A. [1965]. "A new normal form theorem for context-free phrase structure grammars," J. ACM 12: 1, 42-52.
Greibach, S. A. [1966]. "The unsolvability of the recognition of linear context-free languages," J. ACM 13: 4, 582-587.

Greibach, S. A. [1968]. "A note on undecidable properties of formal languages," Math. Systems Theory 2: 1, 1-6.

Greibach, S. A. [1970]. "Full AFL's and nested iterated substitution," Information and Control 16: 1, 7-35.

Greibach, S. A. [1973]. "The hardest context-free language," SIAM J. Computing 2: 4, 304-310.

Greibach, S. A., and J. E. Hopcroft [1969]. "Independence of AFL operations," Studies in Abstract Families of Languages, pp. 33-40, Memoir No. 87, American Mathematical Society, Providence, R.I.

Greibach, S. A., and J. E. Hopcroft [1969]. "Scattered context grammars," J. Computer and Systems Sciences 3: 3, 233-247.

Griffiths, T. V. [1968]. "The unsolvability of the equivalence problem for Λ-free nondeterministic generalized machines," J. ACM 15: 3, 409-413.

Gross, M. [1964]. "Inherent ambiguity of minimal linear grammars," Information and Control 7: 3, 366-368.

Grzegorczyk, A. [1953]. "Some classes of recursive functions," Rozprawy Matematyczne 4, Instytut Matematyczny Polskiej Akademii Nauk, Warsaw, Poland.
Haines, L. [1965]. "Generation and recognition of formal languages," Ph.D. thesis, MIT, Cambridge, Mass.

Hardy, G. H., and E. M. Wright [1938]. An Introduction to the Theory of Numbers, Oxford Univ. Press, London.
Hartmanis, J. [1967]. "Context-free languages and Turing machine computations," Proc. Symposia in Applied Math. 19, American Mathematical Society, Providence, R.I.

Hartmanis, J. [1968]. "Computational complexity of one-tape Turing machine computations," J. ACM 15: 2, 325-339.

Hartmanis, J. [1969]. "On the complexity of undecidable problems in automata theory," J. ACM 16: 1, 160-167.

Hartmanis, J., and J. E. Hopcroft [1968]. "Structure of undecidable problems in automata theory," Proc. Ninth Annual IEEE Symposium on Switching and Automata Theory, pp. 327-333.

Hartmanis, J., and J. E. Hopcroft [1971]. "An overview of the theory of computational complexity," J. ACM 18: 3, 444-475.

Hartmanis, J., and J. E. Hopcroft [1976]. "Independence results in computer science," SIGACT News 8: 4, 13-23.

Hartmanis, J., P. M. Lewis II, and R. E. Stearns [1965]. "Hierarchies of memory limited computations," Proc. Sixth Annual IEEE Symp. on Switching Circuit Theory and Logical Design, pp. 179-190.

Hartmanis, J., and H. Shank [1968]. "On the recognition of primes by automata," J. ACM 15: 3, 382-389.

Hartmanis, J., and R. E. Stearns [1965]. "On the computational complexity of algorithms," Trans. AMS 117, 285-306.

Hayashi, T. [1973]. "On derivation trees of indexed grammars — an extension of the uvwxy theorem," Publications of the Research Institute for Mathematical Sciences 9: 1, 61-92.
Hennie, F. C. [1964]. "Fault detecting experiments for sequential circuits," Proc. Fourth Annual IEEE Symp. on Switching Circuit Theory and Logical Design, pp. 95-110.

Hennie, F. C. [1965]. "One-tape off-line Turing machine computations," Information and Control 8: 6, 553-578.

Hennie, F. C. [1977]. Introduction to Computability, Addison-Wesley, Reading, Mass.

Hennie, F. C., and R. E. Stearns [1966]. "Two-tape simulation of multitape Turing machines," J. ACM 13: 4, 533-546.

Herman, G. T. [1974]. "Closure properties of some families of languages associated with biological systems," Information and Control 24: 2, 101-121.

Herman, G. T., and G. Rozenberg [1975]. Developmental Systems and Languages, North Holland, Amsterdam.
Hibbard, T. N. [1974]. "Context-limited grammars," J. ACM 21: 3, 446-453.

Hogben, L. [1955]. The Wonderful World of Mathematics, Garden City Books, Garden City, N.Y.

Hopcroft, J. E. [1971]. "An n log n algorithm for minimizing the states in a finite automaton," The Theory of Machines and Computations (Z. Kohavi, ed.), pp. 189-196, Academic Press, New York.

Hopcroft, J. E., W. Paul, and L. G. Valiant [1975]. "On time versus space and related problems," Proc. Sixteenth Annual IEEE Symposium on Foundations of Computer Science, pp. 57-64.

Hopcroft, J. E., and J. D. Ullman [1967a]. "Nonerasing stack automata," J. Computer and Systems Sciences 1: 2, 166-186.

Hopcroft, J. E., and J. D. Ullman [1967b]. "An approach to a unified theory of automata," Bell System Technical J. 46: 8, 1763-1829.

Hopcroft, J. E., and J. D. Ullman [1968a]. "Decidable and undecidable questions about automata," J. ACM 15: 2, 317-324.

Hopcroft, J. E., and J. D. Ullman [1968b]. "Deterministic stack automata and the quotient operator," J. Computer and Systems Sciences 2: 1, 1-12.

Hopcroft, J. E., and J. D. Ullman [1968c]. "Sets accepted by one-way stack automata are context sensitive," Information and Control 13: 2, 114-133.

Hopcroft, J. E., and J. D. Ullman [1969a]. "Some results on tape-bounded Turing machines," J. ACM 16: 1, 168-177.

Hopcroft, J. E., and J. D. Ullman [1969b]. Formal Languages and Their Relation to Automata, Addison-Wesley, Reading, Mass.
Huffman, D. A. [1954]. "The synthesis of sequential switching circuits," J. Franklin Institute 257: 3-4, 161-190, 275-303.

Hunt, H. B., III [1973]. "On the time and tape complexity of languages," Proc. Fifth Annual ACM Symposium on the Theory of Computing, pp. 10-19.

Hunt, H. B., III, and D. J. Rosenkrantz [1974]. "Computational parallels between the regular and context-free languages," Proc. Sixth Annual ACM Symposium on the Theory of Computing, pp. 64-74.

Hunt, H. B., III, and D. J. Rosenkrantz [1977]. "On equivalence and containment problems for formal languages," J. ACM 24: 3, 387-396.

Hunt, H. B., III, D. J. Rosenkrantz, and T. G. Szymanski [1976]. "On the equivalence, containment and covering problems for regular expressions," J. Computer and Systems Sciences 12: 2, 222-268.
Hunt, H. B., III, and T. G. Szymanski [1975]. "On the complexity of grammar and related problems," Proc. Seventh Annual ACM Symposium on the Theory of Computing, pp. 54-65.

Hunt, H. B., III, and T. G. Szymanski [1976]. "Complexity metatheorems for context-free grammar problems," J. Computer and Systems Sciences 13: 3, 318-334.

Hunt, H. B., III, T. G. Szymanski, and J. D. Ullman [1975]. "On the complexity of LR(k) testing," Comm. ACM 18: 12, 707-715.

Ibarra, O. H. [1972]. "A note concerning nondeterministic tape complexities," J. ACM 19: 4, 608-612.

Ibarra, O. H. [1977]. "The unsolvability of the equivalence problem for ε-free GSM's with unary input (output) alphabet," Proc. Eighteenth Annual IEEE Symposium on Foundations of Computer Science, pp. 74-81.
Johnson, D. S. [1974]. "Approximation algorithms for combinatorial problems," J. Computer and Systems Sciences 9: 3, 256-278.

Johnson, S. C. [1974]. "YACC — yet another compiler compiler," CSTR 32, Bell Laboratories, Murray Hill, N.J.

Johnson, W. L., J. H. Porter, S. I. Ackley, and D. T. Ross [1968]. "Automatic generation of efficient lexical analyzers using finite state techniques," Comm. ACM 11: 12, 805-813.

Jones, N. D. [1973]. Computability Theory: an Introduction, Academic Press, New York.

Jones, N. D. [1975]. "Space-bounded reducibility among combinatorial problems," J. Computer and Systems Sciences 11: 1, 68-85.

Jones, N. D., and W. T. Laaser [1976]. "Complete problems for deterministic polynomial time," Theoretical Computer Science 3: 1, 105-118.

Jones, N. D., E. Lien, and W. T. Laaser [1976]. "New problems complete for nondeterministic log space," Math. Systems Theory 10: 1, 1-17.

Jones, N. D., and S. S. Muchnick [1977]. "Even simple programs are hard to analyze," J. ACM 24: 2, 338-350.

Kannan, R., and C. L. Monma [1977]. "On the computational complexity of integer programming problems," Report 7780-OR, Inst. für Operations Research, Univ. Bonn, Bonn, West Germany.

Karp, R. M. [1972]. "Reducibility among combinatorial problems," Complexity of Computer Computations, pp. 85-104, Plenum Press, N.Y.

Karp, R. M. [1977]. "The probabilistic analysis of some combinatorial search algorithms," Algorithms and Complexity: New Directions and Recent Results (J. Traub, ed.), pp. 1-20, Academic Press, New York.
Kasami, T. [1965]. "An efficient recognition and syntax algorithm for context-free languages," Scientific Report AFCRL-65-758, Air Force Cambridge Research Lab., Bedford, Mass.

Kasami, T., and K. Torii [1969]. "A syntax analysis procedure for unambiguous context-free grammars," J. ACM 16: 3, 423-431.

Kirkpatrick, D. G., and P. Hell [1978]. "On the completeness of a generalized matching problem," Proc. Tenth Annual ACM Symposium on the Theory of Computing, pp. 240-245.

Kleene, S. C. [1936]. "General recursive functions of natural numbers," Mathematische Annalen 112, 727-742.
Kleene, S. C. [1952]. Introduction to Metamathematics, D. Van Nostrand, Princeton, N.J.

Kleene, S. C. [1956]. "Representation of events in nerve nets and finite automata," Automata Studies, pp. 3-42, Princeton Univ. Press, Princeton, N.J.

Knuth, D. E. [1965]. "On the translation of languages from left to right," Information and Control 8: 6, 607-639.

Knuth, D. E., J. H. Morris, Jr., and V. R. Pratt [1977]. "Fast pattern matching in strings," SIAM J. Computing 6: 2, 323-350.

Kohavi, Z. [1970]. Switching and Finite Automata Theory, McGraw-Hill, New York.

Korenjak, A. J. [1969]. "A practical method for constructing LR(k) processors," Comm. ACM 12: 11, 613-623.

Korenjak, A. J., and J. E. Hopcroft [1966]. "Simple deterministic languages," Proc. Seventh Annual IEEE Symposium on Switching and Automata Theory, pp. 36-46.

Kosaraju, S. R. [1974]. "Regularity preserving functions," SIGACT News 6: 2, 16-17.

Kosaraju, S. R. [1975]. "Context free preserving functions," Math. Systems Theory 9: 3, 193-197.

Kozen, D. [1976]. "On parallelism in Turing machines," Proc. Seventeenth Annual IEEE Symposium on Foundations of Computer Science, pp. 89-97.

Kozen, D. [1978]. "Indexing of subrecursive classes," Proc. Tenth Annual ACM Symposium on the Theory of Computing, pp. 287-295.

Kuroda, S. Y. [1964]. "Classes of languages and linear bounded automata," Information and Control 7: 2, 207-223.
Ladner, R. E. [1975a]. "On the structure of polynomial time reducibility," J. ACM 22: 1, 155-171.

Ladner, R. E. [1975b]. "The circuit value problem is log-space complete for P," SIGACT News 7: 1, 18-20.

Ladner, R. E., N. Lynch, and A. Selman [1974]. "Comparison of polynomial time reducibilities," Proc. Sixth Annual ACM Symposium on the Theory of Computing, pp. 110-121.

Landweber, P. S. [1963]. "Three theorems on phrase structure grammars of type 1," Information and Control 6: 2, 131-136.

Landweber, P. S. [1964]. "Decision problems of phrase structure grammars," IEEE Trans. on Electronic Computers 13, 354-362.

Leong, B., and J. Seiferas [1977]. "New real-time simulations of multihead tape units," Proc. Ninth Annual ACM Symposium on the Theory of Computing, pp. 239-240.

Lesk, M. E. [1975]. "LEX — a lexical analyzer generator," CSTR 39, Bell Laboratories, Murray Hill, N.J.

Levin, L. A. [1973]. "Universal sorting problems," Problemi Peredachi Informatsii 9: 3, 265-266.

Lewis, J. M. [1978]. "On the complexity of the maximum subgraph problem," Proc. Tenth Annual ACM Symposium on the Theory of Computing, pp. 265-274.

Lewis, P. M., II, D. J. Rosenkrantz, and R. E. Stearns [1976]. Compiler Design Theory, Addison-Wesley, Reading, Mass.

Lewis, P. M., II, and R. E. Stearns [1968]. "Syntax directed transduction," J. ACM 15: 3, 465-488.

Lewis, P. M., II, R. E. Stearns, and J. Hartmanis [1965]. "Memory bounds for recognition of context-free and context-sensitive languages," Proc. Sixth Annual IEEE Symp. on Switching Circuit Theory and Logical Design, pp. 191-202.
Lindenmayer, A. [1968]. "Mathematical models for cellular interactions in development, parts I and II," J. Theor. Biol. 18, 280-315.

Lindenmayer, A. [1971]. "Developmental systems without cellular interaction, their languages and grammars," J. Theor. Biol. 30, 455-484.

Machtey, M., and P. R. Young [1978]. An Introduction to the General Theory of Algorithms, North Holland, New York.

Mager, G. [1969]. "Writing pushdown acceptors," J. Computer and Systems Sciences 3: 3, 276-319.

Maibaum, T. S. E. [1974]. "A generalized approach to formal languages," J. Computer and Systems Sciences 8: 3, 409-439.

McCreight, E. M., and A. R. Meyer [1969]. "Classes of computable functions defined by bounds on computation," Proc. First Annual ACM Symposium on the Theory of Computing, pp. 79-88.

McCulloch, W. S., and W. Pitts [1943]. "A logical calculus of the ideas immanent in nervous activity," Bull. Math. Biophysics 5, 115-133.

McNaughton, R., and H. Yamada [1960]. "Regular expressions and state graphs for automata," IEEE Trans. on Electronic Computers 9: 1, 39-47.
Mealy, G. H. [1955]. "A method for synthesizing sequential circuits," Bell System Technical J. 34: 5, 1045-1079.

Meyer, A. R., and R. Ritchie [1967]. "The complexity of loop programs," Proc. ACM Natl. Conf., pp. 465-469.

Meyer, A. R., and L. J. Stockmeyer [1973]. "The equivalence problem for regular expressions with squaring requires exponential space," Proc. Thirteenth Annual IEEE Symposium on Switching and Automata Theory, pp. 125-129.

Miller, G. L. [1976]. "Riemann's hypothesis and tests for primality," J. Computer and Systems Sciences 13: 3, 300-317.

Minsky, M. L. [1961]. "Recursive unsolvability of Post's problem of 'tag' and other topics in the theory of Turing machines," Annals of Math. 74: 3, 437-455.

Minsky, M. L. [1967]. Computation: Finite and Infinite Machines, Prentice Hall, Englewood Cliffs, N.J.

Minsky, M. L., and S. Papert [1966]. "Unrecognizable sets of numbers," J. ACM 13: 2, 281-286.

Moore, E. F. [1956]. "Gedanken experiments on sequential machines," Automata Studies, pp. 129-153, Princeton Univ. Press, Princeton, N.J.

Moore, E. F. (ed.) [1964]. Sequential Machines: Selected Papers, Addison-Wesley, Reading, Mass.

Myhill, J. [1957]. "Finite automata and the representation of events," WADD TR-57-624, pp. 112-137, Wright Patterson AFB, Ohio.

Myhill, J. [1960]. "Linear bounded automata," WADD TR-60-165, pp. 60-165, Wright Patterson AFB, Ohio.

Naur, P., et al. [1960]. "Report on the algorithmic language ALGOL 60," Comm. ACM 3: 5, 299-314; revised in Comm. ACM 6: 1, 1-17.

Nerode, A. [1958]. "Linear automaton transformations," Proc. AMS 9, pp. 541-544.
Oettinger, A. G. [1961]. "Automatic syntactic analysis and the pushdown store," Proc. Symposia in Applied Math. 12, American Mathematical Society, Providence, R.I.

Ogden, W. [1968]. "A helpful result for proving inherent ambiguity," Math. Systems Theory 2: 3, 191-194.

Ogden, W. [1969]. "Intercalation theorems for stack languages," Proc. First Annual ACM Symposium on the Theory of Computing, pp. 31-42.

Oppen, D. C. [1973]. "Elementary bounds for Presburger arithmetic," Proc. Fifth Annual ACM Symposium on the Theory of Computing, pp. 34-37.

Papadimitriou, C. H. [1976]. "On the complexity of edge traversing," J. ACM 23: 3, 544-554.

Papadimitriou, C. H., and K. Steiglitz [1977]. "On the complexity of local search for the traveling salesman problem," SIAM J. Computing 6: 1, 76-83.

Parikh, R. J. [1966]. "On context-free languages," J. ACM 13: 4, 570-581.

Paull, M. C., and S. H. Unger [1968]. "Structural equivalence of context-free grammars," J. Computer and Systems Sciences 2: 4, 427-468.

Paul, W. J. [1977]. "On time hierarchies," Proc. Ninth Annual ACM Symposium on the Theory of Computing, pp. 218-222.

Paul, W. J., R. E. Tarjan, and J. R. Celoni [1976]. "Space bounds for a game on graphs," Proc. Eighth Annual ACM Symposium on the Theory of Computing, pp. 149-160.

Plaisted, D. A. [1977]. "Sparse complex polynomials and polynomial reducibility," J. Computer and Systems Sciences 14: 2, 210-221.

Post, E. [1936]. "Finite combinatory processes-formulation, I," J. Symbolic Logic 1, 103-105.

Post, E. [1943]. "Formal reductions of the general combinatorial decision problem," Amer. J. Math. 65, 197-215.

Post, E. [1946]. "A variant of a recursively unsolvable problem," Bull. AMS 52, 264-268.

Pratt, V. R. [1975]. "Every prime has a succinct certificate," SIAM J. Computing 4: 3, 214-220.

Pratt, V. R., and L. J. Stockmeyer [1976]. "A characterization of the power of vector machines," J. Computer and Systems Sciences 12: 2, 198-221.

Rabin, M. O. [1963]. "Real-time computation," Israel J. Math. 1: 4, 203-211.

Rabin, M. O. [1976]. "Probabilistic algorithms," Algorithms and Complexity: New Directions and Recent Results (J. Traub, ed.), pp. 21-40, Academic Press, New York.

Rabin, M. O., and D. Scott [1959]. "Finite automata and their decision problems," IBM J. Res. 3: 2, 115-125.

Rackoff, C. [1978]. "Relativized questions involving probabilistic algorithms," Proc. Tenth Annual ACM Symposium on the Theory of Computing, pp. 338-342.

Reedy, A., and W. J. Savitch [1975]. "The Turing degree of the inherent ambiguity problem for context-free languages," Theoretical Computer Science 1: 1, 77-91.

Rice, H. G. [1953]. "Classes of recursively enumerable sets and their decision problems," Trans. AMS 89, 25-59.

Rice, H. G. [1956]. "On completely recursively enumerable classes and their key arrays," J. Symbolic Logic 21, 304-341.
Ritchie, R. W. [1963]. "Classes of predictably computable functions," Trans. AMS 106, 139-173.

Rogers, H., Jr. [1967]. The Theory of Recursive Functions and Effective Computability, McGraw-Hill, New York.

Rosenkrantz, D. J. [1967]. "Matrix equations and normal forms for context-free grammars," J. ACM 14: 3, 501-507.

Rosenkrantz, D. J. [1969]. "Programmed grammars and classes of formal languages," J. ACM 16: 1, 107-131.

Rosenkrantz, D. J., and R. E. Stearns [1970]. "Properties of deterministic top-down grammars," Information and Control 17: 3, 226-256.

Rosenkrantz, D. J., R. E. Stearns, and P. M. Lewis, II [1977]. "An analysis of several heuristics for the traveling salesman problem," SIAM J. Computing 6: 3, 563-581.

Rounds, W. C. [1970]. "Mappings and grammars on trees," Math. Systems Theory 4: 3, 257-287.

Ruby, S., and P. C. Fischer [1965]. "Translational methods and computational complexity," Proc. Sixth Annual IEEE Symp. on Switching Circuit Theory and Logical Design, pp. 173-178.

Sahni, S. [1974]. "Computationally related problems," SIAM J. Computing 3: 4, 262-279.

Sahni, S., and T. Gonzalez [1976]. "P-complete approximation problems," J. ACM 23: 3, 555-565.

Sakoda, W. J., and M. Sipser [1978]. "Nondeterminism and the size of two-way finite automata," Proc. Tenth Annual ACM Symposium on the Theory of Computing, pp. 275-286.

Salomaa, A. [1966]. "Two complete axiom systems for the algebra of regular events," J. ACM 13: 1, 158-169.

Salomaa, A. [1973]. Formal Languages, Academic Press, New York.

Savitch, W. J. [1970]. "Relationships between nondeterministic and deterministic tape complexities," J. Computer and Systems Sciences 4: 2, 177-192.

Savitch, W. J. [1972]. "Maze recognizing automata," Proc. Fourth Annual ACM Symposium on the Theory of Computing, pp. 151-156.

Schaefer, T. J. [1976]. "Complexity of decision problems based on finite two-person perfect information games," Proc. Eighth Annual ACM Symposium on the Theory of Computing, pp. 41-49.

Schaefer, T. J. [1978]. "The complexity of satisfiability problems," Proc. Tenth Annual ACM Symposium on the Theory of Computing, pp. 216-226.

Scheinberg, S. [1960]. "Note on the Boolean properties of context-free languages," Information and Control 3: 4, 372-375.

Schutzenberger, M. P. [1963]. "On context-free languages and pushdown automata," Information and Control 6: 3, 246-264.

Seiferas, J. I. [1974]. "A note on prefixes of regular languages," SIGACT News 6: 1, 25-29.

Seiferas, J. I. [1977a]. "Techniques for separating space complexity classes," J. Computer and Systems Sciences 14: 1, 73-99.

Seiferas, J. I. [1977b]. "Relating refined complexity classes," J. Computer and Systems Sciences 14: 1, 100-129.

Seiferas, J. I., M. J. Fischer, and A. R. Meyer [1973]. "Refinements of nondeterministic time and space hierarchies," Proc. Fourteenth Annual IEEE Symposium on Switching and Automata Theory, pp. 130-137.
Seiferas, J. I., and R. McNaughton [1976]. "Regularity preserving relations," Theoretical Computer Science 2: 2, 147-154.

Sethi, R. [1975]. "Complete register allocation problems," SIAM J. Computing 4: 3, 226-248.

Shannon, C. E. [1956]. "A universal Turing machine with two internal states," Automata Studies, pp. 129-153, Princeton Univ. Press, Princeton, N.J.

Shannon, C. E., and J. McCarthy (eds.) [1956]. Automata Studies, Princeton Univ. Press, Princeton, N.J.

Shepherdson, J. C. [1959]. "The reduction of two-way automata to one-way automata," IBM J. Res. 3: 2, 198-200.

Solovay, R., and V. Strassen [1977]. "A fast Monte Carlo test for primality," SIAM J. Computing 6: 1, 84-85. A correction, ibid. 7: 1, p. 118.

Springsteel, F. N. [1976]. "On the pre-AFL of [log n] space and related families of languages," Theoretical Computer Science 2: 3, 295-304.

Stanley, R. J. [1965]. "Finite state representations of context-free languages," MIT Res. Lab. Elect. Quarterly Prog. Rept. No. 76, 276-279, Cambridge, Mass.

Stearns, R. E. [1967]. "A regularity test for pushdown machines," Information and Control 11: 3, 323-340.

Stearns, R. E., and J. Hartmanis [1963]. "Regularity preserving modifications of regular expressions," Information and Control 6: 1, 55-69.

Stockmeyer, L. J. [1973]. "Planar 3-colorability is polynomial complete," SIGACT News 5: 3, 19-25.

Stockmeyer, L. J. [1974]. "The complexity of decision problems in automata theory and logic," MAC TR-133, Project MAC, MIT, Cambridge, Mass.

Stockmeyer, L. J. [1976]. "The polynomial time hierarchy," Theoretical Computer Science 3: 1, 1-22.

Stockmeyer, L. J., and A. R. Meyer [1973]. "Word problems requiring exponential space," Proc. Fifth Annual ACM Symposium on the Theory of Computing, pp. 1-9.

Sudborough, I. H. [1975a]. "A note on tape-bounded complexity classes and linear context-free languages," J. ACM 22: 4, 499-500.

Sudborough, I. H. [1975b]. "On tape-bounded complexity classes and multihead finite automata," J. Computer and Systems Sciences 10: 1, 62-76.

Suzuki, N., and D. Jefferson [1977]. "Verification decidability of Presburger array programs," A Conference on Theoretical Computer Science, pp. 202-212, Univ. of Waterloo, Waterloo, Ont., Canada.

Taniguchi, K., and T. Kasami [1976]. "A result on the equivalence problem for deterministic pushdown automata," J. Computer and Systems Sciences 13: 1, 38-50.

Thompson, K. [1968]. "Regular expression search algorithm," Comm. ACM 11: 6, 419-422.

Trakhtenbrot, B. A. [1964]. "Turing computations with logarithmic delay," Algebra i Logika 3: 4, 33-48.

Turing, A. M. [1936]. "On computable numbers with an application to the Entscheidungsproblem," Proc. London Math. Soc. 2: 42, 230-265. A correction, ibid., 43, pp. 544-546.

Ullman, J. D. [1975]. "NP-complete scheduling problems," J. Computer and Systems Sciences 10: 3, 384-393.
Valiant, L. G. [1973]. "Decision procedures for families of deterministic pushdown automata," Theory of Computation Report No. 1, Univ. of Warwick, Coventry, Great Britain.

Valiant, L. G. [1974]. "The equivalence problem for deterministic finite-turn pushdown automata," Information and Control 25: 2, 123-133.

Valiant, L. G. [1975a]. "General context-free recognition in less than cubic time," J. Computer and Systems Sciences 10: 2, 308-315.

Valiant, L. G. [1975b]. "Regularity and related problems for deterministic pushdown automata," J. ACM 22: 1, 1-10.

Valiant, L. G., and M. S. Paterson [1975]. "Deterministic one-counter automata," J. Computer and Systems Sciences 10: 3, 340-350.

Wang, H. [1957]. "A variant to Turing's theory of computing machines," J. ACM 4: 1, 63-92.

Wegbreit, B. [1969]. "A generator of context-sensitive languages," J. Computer and Systems Sciences 3: 3, 456-461.

Wise, D. S. [1976]. "A strong pumping lemma for context-free languages," Theoretical Computer Science 3: 3, 359-370.

Yamada, H. [1962]. "Real-time computation and recursive functions not real-time computable," IEEE Trans. on Electronic Computers 11: 6, 753-760.

Yannakakis, M. [1978]. "Node and edge deletion NP-complete problems," Proc. Tenth Annual ACM Symposium on the Theory of Computing, pp. 253-264.

Yasuhara, A. [1971]. Recursive Function Theory and Logic, Academic Press, New York.

Younger, D. H. [1967]. "Recognition and parsing of context-free languages in time n³," Information and Control 10: 2, 189-208.
INDEX
Aanderaa, S. O., 319
Abstract family of languages, 277-280
Accepting state (See Final state)
Ackermann's function, 175
Ackley, S. I., 46, 54
Adelman, L., 375-376
AFL (See Abstract family of languages)
Aho, A. V., 54, 75, 106, 124, 227, 268, 269, 394
ALGOL, 268
Algorithm, 146-147
Alphabet, 2
Ambiguous grammar, 87, 200, 255 (See also Inherent ambiguity)
Ancestor, 4
Arbib, M. A., 54
Arc, 2
Arden, D. N., 54
Asymmetry, 7
Atom, 355
A-transducer, 282
Auxiliary pushdown automaton, 377-381, 385, 393
Axiomatic complexity, 312-314
Axt, P., 319

Backus, J. W., 106
Backus-Naur form, 78
Baker, B. S., 216, 284
Baker, T., 376
Bar-Hillel, Y., 76, 145, 216
Beeri, C., 269
Berman, L., 375
Bird, M., 76, 284
Blank, 148
Blattner, M., 395
Blum, M., 319
Blum's axioms, 313
Blum's speed-up theorem, 308-310
Boasson, L., 145
Book, R. V., 216, 319, 375
Boolean expression, 324-325
Borodin, A., 319
Borodin's gap theorem, 306-307
Borosh, I., 374
Brainerd, W. S., 176
Bruno, J. L., 374
Bruss, A. R., 375
Brzozowski, J. A., 54
Bullen, R. H., Jr., 54
Canonical order, 168-169
Cantor, D. C., 106, 216
Cardoza, E., 376
Cartesian product, 5
Celoni, J. R., 319
CFG/CFL (See Context-free grammar/language)
Chandler, W. J., 284
Chandra, A. K., 376
Checking off symbols, 155-156
Chomsky, N., 106, 123, 145, 216, 232
Chomsky hierarchy, 217-232
Chomsky normal form, 92-94
Christofides, N., 375
Chromatic number problem, 341, 369
Church, A., 176
Church's hypothesis, 147, 166
Circuit value problem, 370
Clique cover problem, 365
Clique problem, 365, 369
Closure (See Kleene closure)
Closure, of a relation, 8
Closure properties, 59-63, 130-136, 230, 233, 235-247, 270-284, 365, 392
Cobham, A., 374
Cocke, J., 145 (See also CYK algorithm)
Code generation problem, 367
Coffman, E. G., 374
Cole, S. N., 269
Compiler, 122-123
Complementation, 59, 135, 179-180, 204, 235-239, 279, 281, 342, 392
Complete problem, 323-354
Complexity class, 288
Computable function, 9-10
Computation, of a Turing machine, 201-202, 211-212
Computational complexity, 285-376
Concatenation, 1-2, 28, 59, 131, 230-231, 246, 266, 278, 280
Configuration, 379 (See also Instantaneous description)
Congruence relation, 74
Conjunctive normal form, 325, 328
co-NP, 341-342
Constable, R. L., 319
Containment, of sets, 5
Containment problem, 203
Containment property, 189
Context-free grammar/language, 9, 77-145, 115-120, 189, 203-206, 228, 246-247, 270, 281, 395
Context-sensitive grammar/language, 223-228, 270, 346-347, 390, 393
Conway, J. H., 54
Cook, S. A., 124, 176, 306, 319, 350, 374-375, 394
Corasick, M. J., 54
Countable set, 6
Counter machine, 123, 171-173
Cremers, A., 284
Crossing sequence, 38, 314-315, 318
CSG/CSL (See Context-sensitive grammar/language)
Cudia, D. F., 216
CYCLE, 72, 142-144, 281
CYK algorithm, 139-141

Davis, M., 176
Dead state, 236
Decidable problem, 178 (See also Undecidable problem)
Decision properties, 63-65, 137-141, 230, 247, 281, 393, 395
DeRemer, F. L., 268
Derivation, 80, 84-87, 220, 389
Derivation tree, 82-87
Descendant, 4
Deterministic CSL, 229
Deterministic finite automaton, 19 (See also Finite automaton)
Deterministic language (See Deterministic pushdown automaton)
Deterministic pushdown automaton, 113, 121, 233-269, 281
DFA (See Deterministic finite automaton)
Diagonalization, 182-183, 364
Digraph (See Directed graph)
Directed graph, 2 (See also Transition diagram)
Distinguishable states, 68-69
Domain, 6
DSPACE (See Space complexity)
DTIME (See Time complexity)
Dyck language, 142

Earley, J., 145
Edge, 2
Effective closure, 59, 205
Effective procedure (See Algorithm)
Emptiness problem, 63-64, 137, 189, 281, 348-349
Empty set, 1, 28
Empty stack, acceptance by, 112, 115
Empty string, 1-2, 28
Encoding, of a Turing machine, 181-182
Endmarker, 51, 166, 377, 381
Enumeration, 167-170, 189, 228
Epsilon-CLOSURE, 25
Epsilon-free GSM mapping, 272, 274, 280
Epsilon-free homomorphism, 270, 278, 280
Epsilon-free regular set, 271
Epsilon-move, 24-27, 239, 264
Epsilon-production, 90-92
Equivalence class, 7, 11-12, 66-67
Equivalence problem, 64-65, 281
Equivalence relation, 7, 65-66
Equivalent states, 68
Euclid, 8
Even, S., 375
Evey, J., 123-124
Exact cover problem, 341, 367

FA (See Finite automaton)
Family of languages, 270 (See also Abstract family of languages, Trio)
Father, 4
Fermat's conjecture, 179
Ferrante, J., 375
FIN, 282
Final state, 17, 110, 148, 272
Finite automaton, 9, 13-54, 67-71, 250-253
Finite control, 17 (See also State)
Finiteness problem, 63-64, 137-138, 189, 370
Finite state system, 13-14 (See also Finite automaton)
Finite-turn PDA, 143
First-order theory, 354
Fischer, M. J., 124, 176, 267, 319
Fischer, P. C., 232, 319, 375, 394
Floyd, R. W., 106, 145, 216
Formal language (See Language)
Freedman, A. R., 319
Friedman, A. D., 54
Friedman, E. P., 269
Full AFL, 270, 278, 280 (See also Abstract family of languages)
Full time/space constructibility, 297-299
Full trio, 270-277, 280

Gabriellian, A., 284
Gap theorem (See Borodin's gap theorem)
Garey, M. R., 374-375
Gathen, J., 374
Generalized sequential machine, 272-276 (See also GSM mapping, Inverse GSM mapping)
Gill, J., 376
Ginsburg, S., 76, 106, 124, 145, 216, 232, 267, 269, 284, 394
Godel, K., 147, 354, 371
Gonzalez, T., 375
Graham, R. L., 374
Graham, S. L., 145
Grammar (See Context-free grammar, Context-sensitive grammar, Regular grammar, Type 0 grammar)
Graph, 2
Gray, J. N., 124
Greibach, S. A., 106, 124, 216, 232, 267, 269, 284, 319, 394-395
Greibach normal form, 94-99
Greibach's theorem, 205, 393
Griffiths, T. V., 284
Gross, M., 106, 216
Grzegorczyk, A., 319
GSM (See Generalized sequential machine)
GSM mapping, 272-274, 280

Haines, L., 267
Halting Turing machine, 149, 215
Hamilton circuit problem, 332-336
Handle, 249
Hardy, G. H., 57
Harrison, M. A., 124, 145, 394
Hartmanis, J., 76, 124, 145, 176, 216, 319, 375
Hayashi, T., 395
Hell, P., 374
Hennie, F. C., 76, 176, 319
Herman, G. T., 395
Hibbard, T. N., 232
Hilbert, D., 147
Hogben, L., 8
Homomorphism, 60-61, 132, 230-231, 246, 266, 270, 278, 280, 283
Honesty theorem, 316
Hopcroft, J. E., 75-76, 124, 216, 227, 267-269, 284, 319, 374, 394
Huffman, D. A., 54, 76
Hunt, H. B., III, 216, 374, 376

Ibarra, O. H., 124, 319
ID (See Instantaneous description)
Independence of operators, 279, 282
Indexed language, 389-391, 393
Inductive hypothesis, 4
Infinite set, 6
Inherent ambiguity, 99-103
INIT, 72, 142, 280, 282
Initial state, 17, 110, 272 (See also Start state)
Input alphabet, 17, 110, 272
Input symbol, 148
Input tape, 377, 389
Instance, of a problem, 177
Instantaneous description, 36, 111, 148-149
Integer linear programming, 336-340
Interior vertex, 4
Interleaving, 282
Intersection, 5, 59, 134, 204, 281, 283
Intersection with a regular set, 135-136, 246, 270, 278, 280, 283, 392
Intractable problems, 320-376
Invalid computation (See Computation)
Inverse GSM mapping, 272, 276, 280, 392
Inverse homomorphism, 61, 132-133, 230-231, 246, 270, 278, 280, 283
Inverse substitution, 142
Irreflexivity, 7
Item, 248-252, 261

Jefferson, D., 375
Johnson, D. S., 374-375
Johnson, S. C., 268
Johnson, W. L., 46, 54
Jones, N. D., 176, 375

Kannan, R., 374
Karp, R. M., 374-375
Kasami, T., 145, 269 (See also CYK algorithm)
Kernel, 368
Kirkpatrick, D. G., 374
Kleene, S. C., 54, 176, 216
Kleene closure, 28, 59, 131, 246, 266, 278, 280
k-limited erasing (See Limited erasing)
Knuth, D. E., 54, 268
Kohavi, Z., 54
Korenjak, A. J., 268-269
Kosaraju, S. R., 76
Kozen, D., 376
Kuroda, S. Y., 232

Laaser, W. T., 375
Ladner, R. E., 319, 375-376
Landweber, L. H., 176, 232
Language, 2, 18, 80-81, 112, 149, 168
LBA (See Linear bounded automaton)
Left-linear grammar (See Regular grammar)
Left-matching, 39
Leftmost derivation, 87
Length, of a string, 1
Leong, B., 176
Lesk, M. E., 46, 54
Levin, L. A., 375
Lewis, J. M., 374
Lewis, P. M., II, 106, 124, 176, 269, 319, 375
Lexical analyzer, 9, 54
Lien, E., 375
Limited erasing, 275, 280
Lindenmayer, A., 395
Linear bounded automaton, 225-226
Linear grammar/language, 105, 143, 214, 283-284
Linear programming (See Integer linear programming)
Linear speed-up, 289-291
Lipton, R. J., 376
Literal, 325
Logspace-complete problem, 347, 349-350
Logspace reduction, 322-323
Logspace transducer, 322
Lookahead set, 261
Loop program, 317
LR(0) grammar, 248-260, 267-268
LR item (See Item)
LR(k) grammar, 260-264
L-system, 390-393
Lynch, N., 376

Machtey, M., 176
Mager, G., 394
Maibaum, T. S. E., 395
Manders, K., 375-376
Many-one reduction, 212
MAX, 72, 142, 244-245, 281
McCarthy, J., 54
McCreight, E. M., 319
McCulloch, W. S., 54
McNaughton, R., 54, 76
Mealy, G. H., 54
Mealy machine, 42-45
Membership problem, 139-141, 281
Metalinear language, 143
Meyer, A. R., 124, 176, 319, 375-376
Millen, J. K., 54
Miller, G. L., 232, 375
MIN, 72, 142, 244-245, 279, 281
Minimization of finite automata, 67-71
Minsky, M. L., 54, 76, 176, 216
Modified PCP, 195-196
Monma, C. L., 374
Moore, E. F., 54, 76, 284
Moore machine, 42-45
Morris, J. H., Jr., 54
Move, 149
Multidimensional Turing machine, 164-165
Multihead Turing machine, 165
Multitape Turing machine, 161-163
Myhill, J., 76, 232
Myhill-Nerode theorem, 65-67

Naur, P., 106
Nerode, A., 76 (See also Myhill-Nerode theorem)
Neuron net, 47
Next move function, 148
NFA (See Nondeterministic finite automaton)
Node (See Vertex)
Nondeterministic finite automaton, 19-33
Nondeterministic space complexity, 288, 300-305
Nondeterministic space hierarchy, 304-305
Nondeterministic time complexity, 288, 300, 354-362
Nondeterministic time hierarchy, 306
Nondeterministic Turing machine, 163-164, 288
Nonerasing stack automaton, 382, 385-387, 393
Nonterminal (See Variable)
Normal form PDA, 234
NP, 320-324, 362-365
NP-complete problem, 324-343
NP-hard problem, 324
NSPACE (See Nondeterministic space complexity)
NTIME (See Nondeterministic time complexity)
Number theory, 354, 371
Oettinger, A. G., 123
Off-line Turing machine, 166, 174, 285-286
Ogden, W., 145, 394
Ogden's lemma, 129-130
One-way stack automaton, 382, 389, 393
Operator grammar, 105
Oppen, D. C., 375
Oracle, 209-213, 362-365, 371
Output alphabet, 42-43, 272
Output tape, 167

P, 320-324, 362-365
Pair generator, 169
Palindrome, 2, 11-12, 105
Papadimitriou, C. H., 374
Papert, S., 76
Parikh, R. J., 145
Parser, 9, 268
Parse tree (See Derivation tree)
Partial recursive function, 151
Partial solution to PCP, 197
Partition problem, 341
Paul, W. J., 319
Paull, M. C., 106
P-complete problem, 347-349
PCP (See Post's correspondence problem)
PDA (See Pushdown automaton)
Perfect squares, 57
Perles, M., 76, 145, 216
Phrase structure grammar (See Type 0 grammar)
Pippenger, N., 319
Pitts, W., 54
Polynomial time reduction, 322
Pop, 235, 382
Porter, J. H., 46, 54
Positive closure, 28, 230-231, 278, 280 (See also Kleene closure)
Post, E., 176, 216
Post's correspondence problem, 193-201
Post tag system, 213
Power set, 5
Pratt, V. R., 54, 375-376
Predecessor, 2
Predicting machine, 240-243
Prefix, 1
Prefix property, 121, 260
Prenex normal form, 355
Presburger arithmetic, 354, 371
Primes, 57-58, 342-343
Primitive recursive function, 175
Principal AFL, 283
Problem, 177
Production, 77-79
Proper prefix/suffix, 1
Proper subtraction, 151
Property, of languages, 188
PSPACE, 321
PSPACE-complete problem, 343-347
PSPACE-hard problem, 324
Pumping lemma, 55-58, 72, 125-128, 143, 394-395
Push, 235, 381
Pushdown automaton, 9, 107-124, 264 (See also Deterministic pushdown automaton)
Pushdown transducer, 124

Quantified Boolean formula, 343-346
Quotient, 62-63, 142, 244, 276-277, 280, 392

Rabin, M. O., 54, 319, 375
Rackoff, C., 375
RAM (See Random access machine)
Random access machine, 166-167
Range, 6
Reachability problem, 349-350
Reals with addition, 354-362
Reckhow, R. A., 176
Recursion theorem, 208
Recursive function, 175, 207-209 (See also Partial recursive function)
Recursive language, 151, 169-170, 179-181, 210, 227-228, 270-271
Recursively enumerable language, 150, 168-169, 180-192, 210, 228, 230, 270
Reduction, 321-324 (See also 212-213)
Reedy, A., 216
Refinement, of an equivalence relation, 65
Reflexive and transitive closure, 8
Reflexivity, 7
Regular expression, 9, 28-35, 51, 350-353, 370
Regular grammar, 217-220
Regular set, 18, 55-76, 142, 189, 203, 205, 218, 228, 246, 270-271, 277-278, 280-281
Relation, 6-8
Reversal, 71-72, 142, 281
Rice, H. G., 106, 145, 216
Rice's theorem, 185-192
Right-linear grammar (See Regular grammar)
Right-matching, 39
Rightmost derivation, 87
Right sentential form, 249
Ritchie, R. W., 319
Rogers, H., Jr., 176
Root, 3
Rose, G. F., 76, 124, 145, 216, 284
Rosenberg, A. L., 124, 176
Rosenkrantz, D. J., 106, 216, 269, 374-376, 395
Ross, D. T., 46, 54
Rozenberg, G., 395
Ruby, S., 319
Ruzzo, W. L., 145

SA (See Stack automaton)
Sahni, S., 375
Salomaa, A., 106, 395
Satisfiability problem, 325-331, 370
Savitch, W. J., 216, 319, 375
Savitch's theorem, 301-302
Scattered-context grammar/language, 282-283
Schaefer, T. J., 374, 376
Scheduling problem, 367
Scheinberg, S., 145
Schutzenberger, M. P., 106, 123, 216, 267
Scott, D., 54
Seiferas, J. I., 76, 176, 319
Self-embedding grammar, 229
Selman, A., 376
Semilinear set, 72
Semi-Thue system (See Type 0 grammar)
Sentential form, 81, 143, 389, 395
Set, 5-6
Set former, 5
Sethi, R., 374-375
Shamir, E., 76, 145, 216
Shank, H., 145
Shannon, C. E., 54, 176
Shannon switching game, 370, 372-374
Shepherdson, J. C., 54
Shifting over symbols, 156-157
Shuffle, 142
Sieveking, M., 374
Simple grammar, 229
Simulation, 364
Singletary, W. E., 216
Singleton, 192
Smn theorem, 207
Solovay, R., 375-376
Son, 4
Space-bounded Turing machine, 285
Space complexity, 285-289, 295-298, 300-319, 343-353, 384-385, 387-388, 393
Space constructibility, 297
Space hierarchy, 297-298
Spanier, E. H., 76, 145
Spanning-tree problem, 367
Speed-up (See Blum's speed-up theorem, Linear speed-up)
Springsteel, F. N., 375
Stack, 107, 378, 389
Stack alphabet, 110
Stack automaton, 381-389, 393
Stanley, R. J., 145
Start state, 148 (See also Initial state)
Start symbol, 110
State, 17, 110, 148, 272, 377
Stearns, R. E., 76, 106, 124, 176, 247, 267, 269, 319, 375
Steiglitz, K., 374
Stockmeyer, L. J., 375-376
Storage in finite control, 153-154
Strassen, V., 375
String, 1
Strong connectivity problem, 370
SUB, 282
Subroutine, 157-158
Substitution, 60-61, 131-132, 230-231, 277-278, 280-281, 283
Successor, 2
Sudborough, I. H., 375
Suffix, 1
Suzuki, N., 375
Switching circuit, 13, 47
Symbol, 1
Symmetry, 7
Syntactic category (See Variable)
Szymanski, T. G., 374, 376

Taniguchi, K., 269
Tape, 17, 36, 148, 378
Tape alphabet, 173
Tape compression, 288-289
Tape head, 36
Tape reduction, 289, 292-295
Tape symbol, 148
Tarjan, R. E., 319, 374-375
Terminal, of a grammar, 77, 79, 389
Text editor, 46
Thompson, K., 46, 54
Three-satisfiability, 330-331
Time-bounded Turing machine, 286
Time complexity, 286-295, 299-300, 307, 313, 320-343, 378-381, 393
Time constructibility, 299
Time hierarchy, 295, 299, 303
Torii, K., 145
Total recursive function, 151
Track, of a tape, 154
Trakhtenbrot, B. A., 319
Transition diagram, 16
Transition function, 17, 48
Transition table, 383
Transitive closure (See Reflexive and transitive closure)
Transitivity, 7
Translation lemma, 302-305
Traveling salesman problem, 341, 368
Tree, 3-4 (See also Derivation tree)
Treybig, L. B., 374
Trio, 270-277, 280
Truth-table reduction, 214
Turing, A. M., 176, 216
Turing machine, 9, 146-176, 179, 181-183, 193, 201-204, 221-223, 285-319
Turing reduction, 212-213
Two-stack machine, 171
Two-tape finite automaton, 74
Two-tape Turing machine, 292-295
Two-way finite automaton, 36-42
Two-way infinite tape, 159-161
Two-way nondeterministic finite automaton, 51
Two-way pushdown automaton, 121, 124
Type 0 grammar, 220-223

Ullian, J. S., 106, 216
Ullman, J. D., 54, 75, 106, 124, 216, 227, 267-268, 284, 319, 376, 394
Uncountable set, 6
Undecidable problem, 178-216
Union, 5, 28, 59, 131, 180, 230, 246, 265, 278, 280
Union theorem, 310-312
Unit production, 91
Universal Turing machine, 181-185
Unrestricted grammar (See Type 0 grammar)
Useless symbol, 88-89

Valiant, L. G., 145, 247, 269, 319
Valid computation (See Computation)
Valid item, 249
Variable, of a grammar, 77, 79, 389
Vertex, 2
Vertex cover problem, 331-332
Viable prefix, 249-252

Wang, H., 176
Wegbreit, B., 232
Wise, D. S., 145
Word, 1
Wright, E. M., 57

Yamada, H., 54, 319
Yannakakis, M., 374
Yasuhara, A., 176
Young, P. R., 176
Younger, D. H., 145 (See also CYK algorithm)