Вы находитесь на странице: 1из 5

Good afternoon.

Spring has hit


California. You can see that I am wearing
one of my Hawaiian shirts unlike the
button downs that I recorded the lectures
in, so everything is feeling good and in
this video, I'd like to discuss a few
questions that have arisen, in the
discussion forum, that I think may
represent common misconceptions. First, I
want to remind everyone to be aware of the
types of things that we discussed. Strings
versus characters and so on. Then I want
to point out a subtle difference between
an automaton accepting a string and
accepting a language. And finally there
are a few interesting edge effects that we
need to understand. You may have heard of the
Zen Koan, what is the sound of one hand
clapping? I don't know, but here we face a
harder problem under the sound of no hands
clapping, I want to remind people what it
means for example to compute the sum of
zero integers. We're familiar with types
or classes from work with most any modern
programming language. The things we talked
about in automata theory also have
types, and confusing one type for another
is just as dangerous here as it is in
programming. Two distinctions I want to
emphasize today are first, between
characters and strings, and then between
sets and the members of those sets. A
string is a sequence of zero or more
characters. In most programming languages,
double quotes are placed around the
string. And single quotes around the
character. The distinction is most
important when the string is of length one
because then it looks just like a
character if you don't do something like
double quoted to indicate its type. Most
importantly, epsilon is the way we
represent the empty string. In most
languages, the way to represent the empty
string is by double quotes with nothing in
between them like this. And I think most
confusion occurred when people noticed
that in an epsilon NFA, some arcs are
labeled by the string epsilon while others
are labeled by input characters or input
symbols. Previously, DFAs and ordinary
NFAs had all arcs labeled by characters
but epsilon is not a character and yet we
appeared to use it where you expected an
input symbol, that is, a character to
appear. But there is no real problem,
since a character can be coerced to be a
string of length one. It's in fact very
natural in programming languages. For
example in Java, an easy way to convert
characters to strings is to write the
empty string concatenated with that
character. Since characters are coerced to
strings, the character, say, zero, is
converted to the string zero and
concatenated with the empty string which
leaves just the original character, but
converted to a string of length one. So
for an Epsilon NFA, just think of the
character's labeled in the arcs as strings
of length one. Then you can concatenate
the epsilons and characters along any path
and get a string naturally. As with any
kind of automaton, the sequence of labels
along any path is of type string. Here's
a pretty picture of a path in an
epsilon NFA. It doesn't matter whether the
states are distinct or if some repeat
along the path. You just concatenate the
labels Epsilon, the string zero, the
string one, the string epsilon again, string
zero. The concatenation of these five strings
then is the string 010. Now let's move on
to sets and elements. Elements come in
many different types. For example
integers, strings, characters and so on.
Sets can have elements as members, so for
example a language in the world of a
automata theory is set of elements of
type string. That is, the type of a
language is: set of strings. And remember
that epsilon is a string. While the empty
set is a set. There is a notion of
membership that relates the set to the
members of that set. A set can have
members. The element types like string do
not have members. Technically the members
of a set can be sets themselves but we're
not going to talk about such sets much.
Occasionally, we mention the power set of
a set S whose members are the each of
the subsets of the set S and thus are
sets themselves. But, in most cases you
can assume that sets have members that
are elements and not sets. The empty set
is a set but it has no members. All other
sets do have one or more members. And
elements do not have members either for
example epsilon has no members, but the
reason epsilon has no members is because
it is of a type for which having members
makes no sense. We must not confuse the
empty string with the empty set. They each
have no members, but for very different
reasons and their types are not the same.
Let's see how the set element distinction
applies to states of a non-deterministic
finite automaton. First, states of any
automaton are elements not sets. The subset
construction seems to create a DFA who
states, are sets of the states of the NFA.
But really, what we're doing is computing
sets of NFA states and giving each one a
name. This name is the name of a DFA
state and each DFA state corresponds to a
set of NFA states, but the DFA state is an
element with an associated value and the
value is the set of NFA states. We can
even use the set of NFA states as the name
of the DFA state but we should then
understand the notation for the set like
the set containing p and q, is this, is
a string that is the name of a DFA
state. Here's another point that has
caused some confusion, Automata accept
strings. The labels of the paths that lead
from the start state to an accepting
state. But we also said that an automaton
accepts a language. This language is the
set of strings that the automaton accepts.
If we say automaton A accepts language L
then we mean that L consists of all and
only the strings that A accepts. Thus,
while the typical automaton accepts an
infinite number of different strings, it
accepts exactly one language, the set of
strings that it accepts. Let's remember
that the phrase "automaton A accepts
language L" means that L is the one
language of A. A accepts all the strings
in L and A does not accept any other
strings. The extreme case of the confusion
would be an automaton like the one shown
which accepts every string, but only
accepts one language, the one we refer
to as "zero one star". That's this. That is,
it's the language of all strings of zeros
and ones. If you don't see the difference
between accepting strings and accepting a
language, you would erroneously conclude
that this automaton accepts every language
whatsoever such as the set of zero to the
n, one to the n such that n is equal to or
greater than one. It doesn't. It accepts this
language, the set of all strings of zeroes
and ones. I now want to talk about a point
of confusion that comes up in several
places. What happens when you try to apply
an operation to zero things? For example,
we know what it means to sum two integers
or ten integers but what if we were asked
to sum zero integers or if I have a string
of zeroes and ones, it makes sense to ask
whether the string has an even or odd
number of zeroes but if the string is
empty? So for example, four + seven +
three = fourteen, no problem. If I just
want to sum four + seven, cross out the
three, again no problem that's eleven. If
I just want to sum the four, no seven
there, fine: that sum is four. Now if I
take the four away, what's left? What is
its sum? I claim that in general, we should
take the operation applied to nothing to
be the identity for that operation. For
sum, the identity is zero. That is, zero
plus anything is that other thing. This is
a well accepted convention. I can't prove to
you... that the sum of zero things *must* be
the identity zero, but neither have I seen a
reasonable justification for any other
convention and if you have one it would be
a great topic for the discussion forum of
the class. And it is natural
and intuitive, as we can see If we write the
obvious quote to sum the n elements of
an array. Here's what the code looks like:
you initialize the sum to zero, there, and
then go through the array in a loop. And
you go through n times. You'll get the
correct sum for any n equal to or greater than one
but what happens if n happens to be zero?
If that's the case, you never execute the
loop. You just jump right around it and
the sum is left at zero. And initializing
the sum to anything but zero makes no
sense and would not give the correct
result for n equal to or greater than one
unless you did serious contortions to the
code. Here are some other examples where
the identity for the operator is the only
thing that makes sense. The OR of zero
propositional variables is false because
false is the identity for OR, that is
false OR x = x. Similarly, the AND of no
propositions is true because true AND x = x.
The product of no integers or reals
is one because one * x = x. And if we
concatenate zero strings, we should get
the empty string since epsilon
concatenated with any string x is x. As an
example of why the convention for strings
make sense, look at an automaton where the
start state is also an accepting state.
This automaton has more to it, I just
didn't draw it. There is a path from a to
a that goes through no arcs. It is of
length zero. In general, the string that
labels a path is the concatenation of
all the strings that come from the input
symbols along that path. Again, remember
that the characters labeling the arcs are
converted to strings for concatenation.
But in this case, the path has no arcs so
its label must be the concatenation of
zero strings, that is, the empty string.
That's great because it means that the
empty string is accepted by this automaton
and since the automaton starts out in an
accepting state before it reads any input,
that makes sense. Next, let's look at the
question of whether zero is odd or even.
Curiously, this question causes a great
deal of disagreement. For example, I
remember one day in 1973 when we had gas
rationing, you could only fill up on a day
whose number was even, if the digits on
your license plate formed an even number.
And ditto for odds. But, what if your
license plate had no digits? A vanity
plate like LOVER, okay. Since
legislatures are not Mathematicians,
different states have different policies.
Some treating a plate like this as odd and
others as even. But we know that zero is
even because it leaves remainder of zero
when divided by two. That is zero is two
times an integer namely zero in this case
plus a remainder of zero. And anything
that leaves, that is of the form two times any
integer + zero is even. Now, the empty
string has zero of every character and
zero is even. So if we ask a question
like, does the empty string contain an
even number of some character like zero?
The answer is yes and if we ask whether it
has an odd number of zero, the answer is
no, okay. Here's a bonus answer, okay. A
few people actually use automaton and
automata correctly. Let's learn to be
pedants and use them right, okay.
Automaton is a singular noun and its
plural, an irregular plural because of its
Greek origin is automata, one automaton,
two automata. But it gets worse, okay?
Everybody says a automata theory, never
the singular form, automaton theory. But
other theories use a singular form of the
noun that describes what they are about.
For example, physicists talk about String
Theory or Quantum Theory, never Strings
theory or Quantums theory well, that
should be Quanta Theory anyways since
Quantum is another irregular plural and
don't ask me why. Anyway, see you in
class.

Вам также может понравиться