
The Set of All Possible Finite Tuples

Suppose that

X

is a set, then we denote the set of all possible finite tuples as

X^{*} = ⋃_{n \in ℕ_{0}} X^{n}

The Set of All Strings Over a Finite Alphabet Is Countable

TODO: Add the content for the proposition here.

There are only

{| X |}^{n}

possible strings of length

n

therefore the above is a countable union of finite sets and thus countable.

Note that in the context of computers we will usually refer to tuples as strings, so instead of writing something like $(0, 1, 2, 3, 4)$ we instead just write "01234" to represent the same thing.

Finite Automaton

A finite automaton is a 6-tuple

Q, Σ, δ, q_{0}, F, I

where

$Q$ is a finite set called the states
$Σ$ is a finite set called the alphabet
$I = Σ^{*}$ is the set of possible inputs
$δ : Q \times Σ \to Q$ is the transition function
$q_{0} \in Q$ is the start state
$F \subseteq Q$ is the set of accept states

Note that an input of length $k = 0$ is $()$ , the empty tuple, which we denote by $ϵ$ .

Machine

A machine is a finite automaton.

Output of a Finite Automaton on an Input

Suppose that

M

is a machine and

a = (a_{1}, \dots, a_{k})

is a finite sequence of elements from

M_{Σ}

then we define the following:

if $1 \leq n \leq k$ then $o_{n} = δ (o_{n - 1}, a_{n})$
$o_{0} = q_{0}$

and define the output of

M

after processing

a

to be

o_{k}

which is denoted by

out (M, a)

Compute Sequence of an Input for a Finite Automaton

Suppose that

M

is a machine and

a = (a_{1}, \dots, a_{k})

is a finite sequence of elements from

M_{Σ}

, then we define its compute sequence as

compseq (M, a) = (o_{0}, o_{1}, \dots, o_{k})

and note that

| compseq (M, a) | = | a | + 1

A Finite Automaton Accepts an Input

Suppose that

M

is a machine and

a = (a_{1}, \dots, a_{k})

is a finite sequence of elements from

M_{Σ}

then we say that

M

accepts

a

iff:

out (M, a) \in M_{F}
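The definitions of $compseq$ , $out$ and acceptance can be sketched in Python as follows. This is only an illustration: the dictionary encoding of $δ$ and the example machine (which accepts binary strings ending in 1) are assumptions made for the example, not part of the definitions above.

```python
from typing import Dict, List, Sequence, Set, Tuple

State = str
Symbol = str
Delta = Dict[Tuple[State, Symbol], State]


def compseq(delta: Delta, q0: State, a: Sequence[Symbol]) -> List[State]:
    """Return (o_0, ..., o_k) with o_0 = q0 and o_n = delta(o_{n-1}, a_n)."""
    seq = [q0]
    for symbol in a:
        seq.append(delta[(seq[-1], symbol)])
    return seq


def out(delta: Delta, q0: State, a: Sequence[Symbol]) -> State:
    """The output is the last state of the compute sequence, o_k."""
    return compseq(delta, q0, a)[-1]


def acc(delta: Delta, q0: State, accept: Set[State], a: Sequence[Symbol]) -> bool:
    """The machine accepts a iff its output lies in the set of accept states."""
    return out(delta, q0, a) in accept


# Hypothetical example machine: state "o" means the last character read was 1.
DELTA: Delta = {
    ("e", "0"): "e", ("e", "1"): "o",
    ("o", "0"): "e", ("o", "1"): "o",
}
```

For instance `compseq(DELTA, "e", "011")` has one more entry than the input has characters, matching $| compseq (M, a) | = | a | + 1$ .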

A Finite Automaton Rejects an Input

Suppose that

M

is a machine and

a = (a_{1}, \dots, a_{k})

is a finite sequence of elements from

M_{Σ}

then we say that

M

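rejects

a

iff it does not accept it, that is, iff $out (M, a) \notin M_{F}$ . We write

acc (M, a)

when $M$ accepts $a$ , and $\neg acc (M, a)$ when it rejects $a$ .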

Language

A language

L

is simply a set of tuples.

Language Legible for a Machine

Suppose that

M

is a finite automaton, and that

L

is a language, we say that

L

is legible for

M

iff

L \subseteq M_{I}

If a language is not legible for a particular machine, then there exists an input $x$ in the language such that $out (M, x)$ is not well defined, so in those cases we cannot say much about the language.

Language of a Finite Automaton

lang (M) = {a \in M_{I} : acc (M, a)}

A Finite Automaton Models a Language

Given a legible language

A

for

M

, we say that

M

models

A

iff

A = lang (M)

DFA That Recognizes Everything Except 11 and 111

Give a state diagram of a DFA that recognizes the following language over the alphabet

Σ = {0, 1}

{w \in Σ^{*} : w \notin {11, 111}}

We claim that the following DFA works:
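The state diagram is not reproduced here, so the following is one possible DFA for this language written as a transition table, not necessarily the one originally drawn. The state names are assumptions: `e`, `1`, `11` and `111` record that exactly that prefix has been read, and `ok` means the input can no longer equal 11 or 111.

```python
from itertools import product

DELTA = {
    ("e", "0"): "ok", ("e", "1"): "1",
    ("1", "0"): "ok", ("1", "1"): "11",
    ("11", "0"): "ok", ("11", "1"): "111",
    ("111", "0"): "ok", ("111", "1"): "ok",
    ("ok", "0"): "ok", ("ok", "1"): "ok",
}
START = "e"
ACCEPT = {"e", "1", "ok"}  # every state except "11" and "111"


def accepts(w: str) -> bool:
    state = START
    for c in w:
        state = DELTA[(state, c)]
    return state in ACCEPT


def agrees_with_definition(max_len: int) -> bool:
    """Brute-force check against the set definition for all short strings."""
    for n in range(max_len + 1):
        for tup in product("01", repeat=n):
            w = "".join(tup)
            if accepts(w) != (w not in {"11", "111"}):
                return False
    return True
```

The brute-force check is not a proof, but it is a quick way to catch a wrong arrow in a hand-drawn diagram.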

DFA That Recognizes Only 0 and the Empty String

Give a state diagram of a DFA that recognizes the following language over the alphabet

Σ = {0, 1}

{0, ϵ}

DFA That Recognizes Even Number of Zeros or Exactly Two 1s

Give a state diagram of a DFA that recognizes the following language over the alphabet

Σ = {0, 1}

{w \in Σ^{*} : even (count (w, 0)) \lor count (w, 1) = 2}
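One way to approach this exercise is to track two pieces of information at once: the parity of the number of zeros, and the number of ones capped at 3 (since "three or more" never becomes "exactly two" again). The sketch below encodes that idea as a DFA with $2 \cdot 4 = 8$ states; the pair encoding of states is an assumption made for the example.

```python
from itertools import product

# A state is (zeros mod 2, min(number of ones, 3)).
START = (0, 0)


def step(state, c):
    zeros_parity, ones = state
    if c == "0":
        return ((zeros_parity + 1) % 2, ones)
    return (zeros_parity, min(ones + 1, 3))


def is_accept(state):
    zeros_parity, ones = state
    return zeros_parity == 0 or ones == 2


def accepts(w: str) -> bool:
    state = START
    for c in w:
        state = step(state, c)
    return is_accept(state)


def agrees_with_definition(max_len: int) -> bool:
    """Compare the DFA against the set definition on all short strings."""
    for n in range(max_len + 1):
        for tup in product("01", repeat=n):
            w = "".join(tup)
            expected = w.count("0") % 2 == 0 or w.count("1") == 2
            if accepts(w) != expected:
                return False
    return True
```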

Regular Language

Suppose that

L

is a language, then we say that it is regular iff there exists a finite automaton

M

that recognizes it. We denote this by

reg (L)

, in symbols that is:

reg (L) ⟺ \exists M, lang (M) = L

Concatenation of Two Languages

A \circ B = {concat (x, y) : x \in A, y \in B}

Concatenation of Languages Is Associative

(A \circ B) \circ C = A \circ (B \circ C)

TODO: Add the proof here.

Star of a Language

A^{*} = {concat (x_{1}, \dots, x_{k}) : k \in ℕ_{0}, x_{i} \in A}

Note that when $k = 0$ an empty concatenation is defined as $ϵ$ , so that the star of any language will always contain $ϵ$ .
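For finite languages these operations can be tried out directly. The sketch below represents strings as Python `str` and languages as `set`; since $A^{*}$ is usually infinite, the hypothetical helper `star_up_to` only collects the members up to a chosen length.

```python
def concat_lang(A, B):
    """A o B: every x in A followed by every y in B."""
    return {x + y for x in A for y in B}


def star_up_to(A, max_len):
    """All strings of A* whose length is at most max_len."""
    result = {""}      # the empty concatenation (k = 0)
    frontier = {""}
    while frontier:
        longer = {w for w in concat_lang(frontier, A) if len(w) <= max_len}
        frontier = longer - result   # keep only strings not seen before
        result |= frontier
    return result
```

Note that `star_up_to(set(), n)` is `{""}` , in line with the remark that the star of any language contains $ϵ$ .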

Nondeterministic Finite Automaton

A nondeterministic finite automaton is a 6-tuple

Q, Σ, δ, q_{0}, F, I

where

$Q$ is a finite set called the states
$Σ$ is a finite set called the alphabet
$I = {(a_{1}, \dots, a_{k}) : k \in ℕ_{0}, a_{i} \in Σ}$ is the set of possible inputs
$δ : Q \times Σ_{ϵ} \to P (Q)$ is the transition function
$q_{0} \in Q$ is the start state
$F \subseteq Q$ is the set of accept states

NFA is shorthand for the above; to differentiate, we call a finite automaton a DFA, where the D stands for deterministic. Additionally, when drawing these out you'll notice that a given state sometimes does not have an arrow for each character in the alphabet. Whenever this is the case it means that $δ (q, a) = \emptyset$ , that is, there is no transition that starts at $q$ and reads $a$ . This means that an NFA can get "stuck": a branch of the computation dies when it has no transition available for the next character.

Multiple State Transition Function

Given an NFA

N

, any

c \in N_{Σ}

and any

R \subseteq N_{Q}

we define the following overloaded function:

δ (R, c) = ⋃_{r \in R} δ (r, c)

One visual that helps me remember the above: imagine you're tracking the whereabouts of multiple trains in a subway system and you want to get on one of them as fast as possible, so you look at which stations they'll be at next in order to board at the nearest one. The above takes in the collection of stations the trains are at now and tells you where they'll be next.

Epsilon Reachable States

Suppose that

N

is an NFA and

R \subseteq N_{Q}

then we define the set of reachable states from

R

by travelling along 0 or more epsilon arrows by:

ϵ (R)

Note that $ϵ (A \cup B) = ϵ (A) \cup ϵ (B)$ and that $ϵ (R) \supseteq R$

Epsilon Reachable Multiple State Transition Function

Given an NFA

N

, any

c \in N_{Σ}

and any

R \subseteq N_{Q}

we define

δ^{ϵ} (R, c) = ϵ (δ (R, c))

Every DFA Is an NFA

Every DFA can trivially be converted to an equivalent NFA

Suppose we have a DFA, then construct an NFA as follows

$Q^{'} = Q$
$Σ^{'} = Σ$
$δ^{'} (q, a) = {δ (q, a)}$ for $a \in Σ$ and $δ^{'} (q, ϵ) = \emptyset$
$q_{0}^{'} = q_{0}$
$F^{'} = F$

Then given a string $s \in Σ^{*}$ , by induction we can prove that $| compseq (D, s) | = | compseq (N, s) |$ and that for any valid $k$ we have $compseq {(D, s)}_{k} = compseq {(N, s)}_{k}$ (identifying each singleton set of NFA states with its element), thus $out (D, s) = out (N, s)$ , so we can conclude that $acc (D, s) ⟺ acc (N, s)$ , showing that $lang (D) = lang (N)$ .

Conversion of an NFA to a DFA

Convert the following NFA to a DFA:

As per the conversion process, we set

Q^{'} = 𝒫 (Q)

, and then for any

R \in Q^{'}

and

c \in Σ

we define our transition function by

δ^{ϵ} (R, c)

$δ^{ϵ} ({1, 2}, a) = ϵ ({1, 3}) = {1, 2, 3}$
$δ^{ϵ} ({1, 2}, b) = ϵ (\emptyset \cup \emptyset) = \emptyset$
$δ^{ϵ} ({1, 2, 3}, a) = {1, 2, 3}$
$δ^{ϵ} ({1, 2, 3}, b) = ϵ (\emptyset \cup \emptyset \cup {2, 3}) = {2, 3}$
$δ^{ϵ} ({2, 3}, a) = ϵ ({1, 2}) = {1, 2}$
$δ^{ϵ} ({2, 3}, b) = {2, 3}$
$δ^{ϵ} (\emptyset, a) = \emptyset = δ^{ϵ} (\emptyset, b)$

Note that we could analyze what it does to other states, but they are not reachable in the final diagram and thus can be removed. This is true because we explored the graph starting at the root following the BFS algorithm, which we know explores all reachable states. Thus here is the final simplified DFA:
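The BFS exploration described above can be sketched in Python. The original NFA diagram is not reproduced here, so `NFA_DELTA` below is a hypothetical NFA reconstructed only so that it agrees with the transitions listed above (start state 1 and the arrow $1 \to 2$ on $ϵ$ are assumptions, and accept states are left out because they are not recoverable from the text).

```python
from collections import deque

# Hypothetical NFA; "" stands for an epsilon arrow. Missing keys mean the empty set.
NFA_DELTA = {
    (1, ""): {2},
    (1, "a"): {3},
    (2, "a"): {1},
    (3, "a"): {2},
    (3, "b"): {2, 3},
}
ALPHABET = ("a", "b")


def eps_closure(states):
    """States reachable from `states` along 0 or more epsilon arrows."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in NFA_DELTA.get((q, ""), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def delta_eps(R, c):
    """Read c from every state in R, then follow epsilon arrows."""
    moved = set()
    for r in R:
        moved |= NFA_DELTA.get((r, c), set())
    return eps_closure(moved)


def subset_construction(start):
    """BFS over the subsets reachable from the epsilon closure of the start state."""
    start_set = eps_closure({start})
    table = {}
    queue = deque([start_set])
    while queue:
        R = queue.popleft()
        if R in table:
            continue
        table[R] = {c: delta_eps(R, c) for c in ALPHABET}
        queue.extend(table[R].values())
    return start_set, table
```

Under these assumptions only four of the eight subsets of ${1, 2, 3}$ are ever reached, which is why the remaining ones can be dropped.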

Every Language Modelled by a DFA Is Modelled by an NFA

As per title.

Since for every DFA, there is a trivial conversion to an equivalent NFA, then if a language

L

is modelled by some DFA D, so that

lang (D) = L

then the equivalent NFA

N

satisfies

lang (N) = lang (D)

and we conclude that

lang (N) = L

showing that

N

models

L

Machine

A machine is a DFA or an NFA.

Two Machines Are Equivalent

We say that

M_{1}, M_{2}

are equivalent iff

lang (M_{1}) = lang (M_{2})

Every Language Recognized by an NFA Is Recognized by a DFA

As per title.

TODO: Add the proof here.

A Language Is Regular Iff There Is an NFA That Recognizes It

reg (A) ⟺ \exists N, lang (N) = A

Suppose that there exists an NFA $N$ such that $rec (N, A)$ , then by the above theorem there exists some DFA $D$ such that $rec (D, A)$ therefore $reg (A)$ .

The other direction is simpler: if $reg (A)$ then there exists some DFA $D$ such that $rec (D, A)$ , but a DFA is an NFA, so we've shown that there is some NFA that recognizes $A$ as needed.

Because this is true it means that a language is regular iff there is an NFA or DFA which recognizes it.

Regular Languages Are Closed Under Intersection

Suppose that

reg (A)

and

reg (B)

then

reg (A \cap B)

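This does not follow merely from

A \cap B \subseteq A, B

since a subset of a regular language need not be regular (every language over $Σ$ is a subset of $Σ^{*}$ ). Instead, take DFAs recognizing $A$ and $B$ and run them in parallel: the product machine has state set $Q_{A} \times Q_{B}$ , moves both components on each character, and accepts exactly when both components are in accept states, so it recognizes $A \cap B$ .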

Regular Languages Are Closed Under Union

Suppose that

reg (A)

and

reg (B)

then

reg (A \cup B)

TODO: Add the proof here.

Regular Languages Are Closed Under Concatenation

Suppose that

reg (A)

and

reg (B)

then

reg (A \circ B)

TODO: Add the proof here.

Regular Languages Are Closed Under the Star Operation

Suppose that

reg (A)

then

reg (A^{*})

TODO: Add the proof here.

Regular Languages Are Closed Under Complementation

reg (A) ⟹ reg (Σ^{*} ∖ A)

TODO: Add the proof here.

Regular Languages Are Closed Under Intersection

reg (A) \land reg (B) ⟹ reg (A \cap B)

TODO: Add the proof here.

Reversal of a Language

Suppose that

A

is a language, then we define the reverse of the language by

A^{ℛ} = {rev (a) : a \in A}

Regular Languages Are Closed Under Reversal

reg (A) ⟹ reg (A^{ℛ})

TODO: Add the proof here.

Homomorphism From One Alphabet to Strings Over Another

A homomorphism from one alphabet to strings over another is a function

f : Σ ⟶ Γ^{*}

from one alphabet to strings over another alphabet. We can extend

f

to operate on strings by defining

f (w) = f (w_{1}) f (w_{2}) \dots f (w_{n})

, where

w = w_{1} w_{2} \dots w_{n}

and each

w_{i} \in Σ

. We further extend

f

to operate on languages by defining

f (A) = {f (w) | w \in A}

, for any language

A

It was defined as above, mapping single characters of one alphabet to strings over another, to be more general from the get go, but you can certainly have homomorphisms from characters of one alphabet to characters of another alphabet just the same.

Regular Languages Are Closed Under Homomorphism

Suppose that

f

is a homomorphism, then

reg (A) ⟹ reg (f (A))

TODO: Add the proof here.

Regular Languages Are Closed Under the Inverse Image of a Homomorphism

Suppose that

f

is a homomorphism, then

reg (A) ⟹ reg (f^{- 1} (A))

TODO: Add the proof here.

Regular Languages Are Closed Under Perfect Shuffle

For languages

A

and

B

, let the perfect shuffle of

A

and

B

be the language

perfectshuffle (A, B)

defined as:

{w | w = a_{1} b_{1} \dots a_{k} b_{k} where a_{1} \dots a_{k} \in A and b_{1} \dots b_{k} \in B each a_{i}, b_{i} \in Σ}

Show that

reg (A) \land reg (B) ⟹ reg (perfectshuffle (A, B))

Instead of making an explicit construction we use the fact that regular languages are closed under intersection, homomorphism and inverse homomorphism to prove that the shuffle is regular.

Consider the alphabet $Σ \times Σ$ of pairs, and the homomorphisms $left, right : Σ \times Σ \to Σ^{*}$ defined by $left ((a, b)) = a$ and $right ((a, b)) = b$ . If we consider $X = {left}^{- 1} (A)$ , then $X = {(a_{1}, b_{1}) \dots (a_{n}, b_{n}) : n \in ℕ_{0}, (a_{i}, b_{i}) \in Σ \times Σ s.t. a_{1} \dots a_{n} \in A}$ , where the last condition $a_{1} \dots a_{n} \in A$ comes about because we require that $left ((a_{1}, b_{1}) \dots (a_{n}, b_{n})) = a_{1} \dots a_{n} \in A$ . Note that since $left$ is a homomorphism and $reg (A)$ , we have $reg ({left}^{- 1} (A))$ , equivalently $reg (X)$ .

Analogously we find that $Y = {right}^{- 1} (B) = {(a_{1}, b_{1}) \dots (a_{n}, b_{n}) : n \in ℕ_{0}, (a_{i}, b_{i}) \in Σ \times Σ s.t. b_{1} \dots b_{n} \in B}$ is regular. Now we have $X \cap Y = {(a_{1}, b_{1}) \dots (a_{n}, b_{n}) : n \in ℕ_{0}, (a_{i}, b_{i}) \in Σ \times Σ s.t. a_{1} \dots a_{n} \in A, b_{1} \dots b_{n} \in B}$ . Then the homomorphism $unpack : {(Σ \times Σ)}^{*} \to Σ^{*}$ defined as $unpack ((a, b)) = a b$ (it is a homomorphism, because $unpack ((a, b) (c, d)) = a b c d$ and you can check the rest) is interesting because it threads them one by one, so that $unpack (X \cap Y) = perfectshuffle (A, B)$ . Since we know that $reg (X)$ and $reg (Y)$ , and that regular languages are closed under intersection and homomorphism, we conclude that $perfectshuffle (A, B)$ is regular.

The Language of Binary Digits Divisible by a Constant Is Regular

Suppose that

n \in ℕ_{1}

, then the language

C_{n} = {x : x is a binary number such that, n | x}

is regular.

Before we continue we cast away two trivial cases. When $n = 1$ , since every number is divisible by 1, we construct a trivial DFA with one state which is the accept state and all arrows point back towards itself. The case of $n = 2$ can also be disposed of quite quickly, as we can construct a two state DFA where we start in the accept state; if we read a one we go to or stay in the non-accept state, and if we read a zero we go back to or stay inside the accept state. This works because our DFA reads from left to right, the last digit of a binary number is at the right, and a binary number is odd iff its last digit is one, therefore this DFA is in the non-accept state exactly when the last read digit is one, as needed.

We construct a DFA as follows. We have states $q_{0}, q_{1}, \dots, q_{n - 1}$ where $q_{i}$ will be the state when the currently read in binary number $b$ has remainder $i$ mod $n$ , that is $b % n = i$ . To make this claim true, we define our transition function as follows: $δ (q_{j}, 0) = q_{(2 \cdot j) % n}$ and $δ (q_{j}, 1) = q_{(2 \cdot j + 1) % n}$

We do this because reading the next binary digit pushes all previous digits one power higher (ie multiplies the number by two), and then the new digit, one or zero, is added on as well.

Note that given a number $k \in ℕ_{1}$ then its true that $(a k + b) % n = ((a % n) (k % n) + b % n) % n$ , since in this case we know that $n \geq 3$ and actually $a = 2$ and $b \in {0, 1}$ then $a % n = a$ and $b % n = b$ so we have $(2 k + b) % n = (2 (k % n) + b) % n$

This shows that the remainder of multiplying $k$ by 2 and then either adding $0$ or $1$ is the same as multiplying $k % n$ by 2 and then adding $0$ or $1$ , by induction this property holds when iterated, thus this shows that our construction of the transition function actually satisfies the claim that $q_{i}$ is the state when the currently read in binary number $b$ has remainder $i$ mod $n$ , therefore after reading the entire number if it has remainder $r$ mod $n$ the DFA will be on state $q_{r}$ thus by only making $q_{0}$ the accept state our DFA is correct, and so $C_{n}$ is regular.
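The construction above translates almost literally into code. In the sketch below state $q_{j}$ is represented by the integer `j` and the transition table is built from $δ (q_{j}, b) = q_{(2 \cdot j + b) % n}$ ; the function names are made up for the example, and the empty string is treated as the number 0 (an assumption, since the text does not say).

```python
def build_delta(n):
    """Transition table: reading bit b from remainder j leads to (2*j + b) % n."""
    return {(j, b): (2 * j + b) % n for j in range(n) for b in (0, 1)}


def accepts(n, bits):
    """Run the remainder DFA on a binary string, most significant bit first."""
    delta = build_delta(n)
    state = 0                      # start state q_0: nothing read, remainder 0
    for ch in bits:
        state = delta[(state, int(ch))]
    return state == 0              # q_0 is the only accept state


def agrees_with_arithmetic(max_n, max_value):
    """Compare the DFA with ordinary modular arithmetic on small numbers."""
    for n in range(1, max_n + 1):
        for value in range(max_value + 1):
            if accepts(n, format(value, "b")) != (value % n == 0):
                return False
    return True
```

The same table also covers $n = 1$ and $n = 2$ , so in code the two special cases do not need separate treatment.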

Regular Expression

We say that a language

R

is a regular expression over an alphabet

Σ

if

R

equals one of:

${a}$ for some $a$ in the alphabet $Σ$
${ϵ}$
$\emptyset$
The union of two regular expressions $R_{1}, R_{2}$ , denoted by $(R_{1} \cup R_{2})$
The concatenation of two regular expressions $R_{1}, R_{2}$ denoted by $(R_{1} \circ R_{2})$
The star of a regular expression $R_{1}$ denoted by $(R_{1}^{*})$

The Set of All Regexes Over an Alphabet

Suppose that

Σ

is an alphabet, then we denote all possible regexes over

Σ

by

RX (Σ)

Regular Expression Shorthand

Suppose that

R_{1}, \dots R_{k} \in RX (Σ)

, then we develop the following shorthand to write out regular expressions easier:

$(R_{1} \circ R_{2} \circ \dots \circ R_{k}) = R_{1} R_{2} \dots R_{k}$

A Regular Expression Is a Regular Language

For any language

L

L \in RX (Σ) ⟹ reg (L)

Given a regular expression $L \in RX (Σ)$ we show that there is an NFA $N$ that models $L$ thus showing that $L$ would be regular.

Since the definition of a regular expression is recursive by nature we split cases on the possibilities and complete the proof using structural induction.

The first case is if $R = {a}$ for $a \in Σ$ , if that's the case then a two state NFA N with a start state on the left, and an accept state on the right, with a single transition when the letter $a$ is read will work, as $lang (N) = {a}$ . Formally we have $N = ({q_{1}, q_{2}}, Σ, δ, q_{1}, {q_{2}})$ where $δ (q_{1}, a) = {q_{2}}$ and $δ (r, b) = \emptyset$ for $r \neq q_{1}$ or $b \neq a$

If $R = {ϵ}$ then $N = ({q_{1}}, Σ, δ, q_{1}, {q_{1}})$ with $δ (r, b) = \emptyset$ for any $r, b$ , then $lang (N) = {ϵ}$

If $R = \emptyset$ then $N = ({q}, Σ, δ, q, \emptyset)$ with $δ (r, b) = \emptyset$ for any $r, b$

Now by structural induction suppose that $R_{1}, R_{2} \in RX (Σ)$ such that $reg (R_{1})$ and $reg (R_{2})$ .

If $R = R_{1} \cup R_{2}$ then since regular languages are closed under union, we have that $reg (R)$ .

If $R = R_{1} \circ R_{2}$ , since regular languages are closed under concatenation then $reg (R)$

If $R = R_{1}^{*}$ , since regular languages are closed under the star operation then $reg (R)$

Thus by structural induction we can conclude that the statement holds true.

Generalized Nondeterministic Finite Automaton

A generalized nondeterministic finite automaton is a 6-tuple

Q, Σ, δ, q_{start}, q_{accept}, I

where

$Q$ is a finite set called the states
$Σ$ is a finite set called the alphabet
$I = {(a_{1}, \dots, a_{k}) : k \in ℕ_{0}, a_{i} \in Σ}$ is the set of possible inputs
$δ : (Q ∖ {q_{accept}}) \times (Q ∖ {q_{start}}) \to RX (Σ)$ is the transition function
$q_{start} \in Q$ is the start state
$q_{accept} \in Q$ is the accept state

For Every DFA There Is an Equivalent GNFA

TODO: Add the content for the lemma here.

TODO: Add the proof here.

The Language of Any GNFA Equals Some Regular Expression

For any GNFA G, there exists some

L \in RX (Σ)

such that

lang (G) = L

TODO: Add the proof here.

If a Language Is Regular Then It Is a Regular Expression

Suppose that

L

is a language over

Σ

, then

reg (L) ⟹ L \in RX (Σ)

TODO: Add the proof here.

A Language Is Regular Iff It Equals a Regular Expression

Let

Σ

be an alphabet, and let

L

be a language over that alphabet, then

reg (L) ⟺ \exists R \in RX (Σ), L = R

TODO: Add the proof here.

Pumping

Suppose that

reg (A)

then there exists

p \in ℕ_{1}

such that for every

s \in A

with

| s | \geq p

there exists

x, y, z

such that

s = x y z

where

for every $i \in ℕ_{0}$ we have $x y^{i} z \in A$
$| y | > 0$
$| x y | \leq p$

Let $M = (Q, Σ, δ, q_{0}, F)$ be a DFA recognizing $A$ , then we set $p = | Q |$ . Now suppose that $s \in A$ such that $| s | \geq p$ . Since $| compseq (M, s) | = | s | + 1$ we know that $| compseq (M, s) | > p$ , so by the pigeonhole principle there must exist a duplicated state $q_{d} \in compseq (M, s)$ .

Thus $compseq (M, s) = (q_{0}, \dots, q_{b}, q_{d}, \dots, q_{d}, \dots, q_{k})$ where we note that $q_{k} \in F$ as $rec (M, A)$ . Write $compseq (M, w, q)$ for the compute sequence of $w$ when $M$ is started in state $q$ instead of $q_{0}$ . Now let $s = x y z$ , where $x$ is the part read up to the first instance of the duplicate: $compseq (M, x) = (q_{0}, \dots, q_{d})$ , $y$ is the part between the duplicates: $compseq (M, y, q_{d}) = (q_{d}, \dots, q_{d})$ , and $z$ is the part after the duplicate: $compseq (M, z, q_{d}) = (q_{d}, \dots, q_{k})$ .

Now we claim that the three properties hold true. For the first property let $i \in ℕ_{0}$ . If $i = 0$ then we're looking at the string $x z$ , which is accepted because $compseq (M, x z) = concat (compseq (M, x), compseq (M, z, q_{d}) [1 :])$ and the sequence on the right ends in an accept state. Whenever $i \geq 1$ , in a similar manner $compseq (M, x y^{i} z) = concat (compseq (M, x), {(compseq (M, y, q_{d}) [1 :])}^{i}, compseq (M, z, q_{d}) [1 :])$ , and for the same reason the sequence on the right ends in an accept state, so the first property holds true.

The reason why $| y | > 0$ is because in order to move to another state you must process a character, since we encounter $q_{d}$ twice, it means that we must read at least one character to do this.

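By the pigeonhole principle a repeated state is guaranteed within the first $p + 1$ states of the compute sequence. Adding to our assumptions that $q_{d}$ is the first repetition in the sequence, the second occurrence of $q_{d}$ is among those first $p + 1$ states, so at most $p$ characters have been read when it is reached, that is $| x y | \leq p$ .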
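The decomposition used in this proof is constructive, so it can be sketched in code: walk the compute sequence, stop at the first state that repeats, and cut the string at the two occurrences. The two state DFA below (binary strings ending in 1) and the name `pump_split` are assumptions chosen for the illustration.

```python
# Hypothetical DFA with p = 2 states that accepts binary strings ending in 1.
DELTA = {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "e", ("o", "1"): "o"}
START, ACCEPT = "e", {"o"}


def compseq(s):
    seq = [START]
    for c in s:
        seq.append(DELTA[(seq[-1], c)])
    return seq


def accepts(s):
    return compseq(s)[-1] in ACCEPT


def pump_split(s):
    """Split s = x y z at the first repeated state of the compute sequence."""
    first_seen = {}
    for index, state in enumerate(compseq(s)):
        if state in first_seen:
            i = first_seen[state]
            # x leads to the first visit, y loops back to it, z is the rest.
            return s[:i], s[i:index], s[index:]
        first_seen[state] = index
    raise ValueError("no repeated state: the string is shorter than the number of states")
```

Because the scan stops at the earliest repetition, the cut points satisfy $| y | > 0$ and $| x y | \leq p$ , and pumping $y$ keeps the machine in the same state at the start of $z$ .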

Pumping for Non-Regularity

While the pumping lemma gives us a characterization of strings in regular languages, it also provides us a way of determining when something is not regular. For example, if we wanted to show that a language was not regular, we would assume for the sake of contradiction that it was regular, and thus there would exist some $p$ such that any string in the language of length greater than or equal to $p$ would have the three properties.

Thus we can obtain our contradiction by constructing a string in the language of length at least $p$ for which no decomposition satisfies all the conditions. Most of the time condition one provides the most flexibility, because if we choose our input string in a smart way, then we can raise $y$ to a power which creates a string that is no longer part of the language. This is a common technique to prove that a language is not regular.

Pumping Lemma Choice of DFA

Suppose that

reg (A)

and that

D

is any DFA recognizing

A

with

p

states, then for any

s \in A

with

| s | \geq p

there exists

x, y, z

such that

s = x y z

where

for every $i \in ℕ_{0}$ we have $x y^{i} z \in A$
$| y | > 0$
$| x y | \leq p$

TODO: Add the proof here.

For Any k There Is a Binary Language and a DFA With k States That Recognizes It, but No DFA With k - 1 States That Recognizes It

For every

k \in ℕ_{2}

there exists a language

A_{k} \subseteq {0, 1}^{*}

and a DFA with

k

states that recognizes it, but no DFA with

k - 1

states that recognizes it.

Let $k \in ℕ_{2}$ and consider the language given by $A_{k} = {s \in {0, 1}^{*} : substr (0^{k - 1}, s)}$ that is the set of strings that have $0^{k - 1}$ as a substring.

Clearly there is a DFA with $k$ states which recognizes this language: take $k$ states all in a line, with the last state being the accept state, chained together with arrows that advance when reading a $0$ , and with every non-accept state sent back to the initial state whenever a $1$ is read. The accept state loops to itself on both characters. The only way to get to the end state is to read $k - 1$ consecutive zeros, which is exactly the condition of having $0^{k - 1}$ as a substring.

On the other hand, assume for the sake of contradiction that there was a DFA C with $k - 1$ states which could recognize $A_{k}$ . Since $A_{k}$ is recognized by a DFA it is regular, thus using the above corollary of the pumping lemma, since C is a DFA with $k - 1$ states that recognizes $A_{k}$ and $0^{k - 1} \in A_{k}$ is a string with length greater than or equal to $k - 1$ , there exist $x, y, z$ such that $0^{k - 1} = x y z$ , along with the three properties.

By the second property we know that $| y | > 0$ , therefore $y = 0^{j}$ for some $j \in ℕ_{1}$ , which implies $x z = 0^{k - 1 - j}$ simply because when all three are concatenated together we must get $0^{k - 1}$ . But then based on the first property we have that $x y^{i} z \in A_{k}$ for any $i \in ℕ_{0}$ , so specifically if we choose $i = 0$ we know that $x z \in A_{k}$ . This is a problem because $x z = 0^{k - 1 - j}$ and $j \geq 1$ , so $x z$ is a sequence of $m$ zeros where $m < k - 1$ , and clearly $0^{k - 1}$ could not be a substring of this. Therefore $x z \notin A_{k}$ , which is a contradiction, so no such DFA can exist.

TODO come back to the above later on

In English, the pumping lemma is saying that whenever you have a regular language, there is a threshold such that any string in the language whose length meets or exceeds that threshold can be decomposed into three pieces such that the middle piece can be duplicated over and over, and the string remains part of that regular language (under two other small conditions).

Rotational Closure of a Language

RC (A) = {y x : x y \in A}

Intuitively this allows you to split a string in $A$ and then glue it back together the other way around.

The Rotational Closure Only Applies Once

RC (A) = RC (RC (A))

Note that since for any $x \in A$ we have $ϵ x \in A$ therefore $x ϵ = x \in RC (A)$ showing that $A \subseteq RC (A)$ , moreover that holds in general for any set $A$ , so therefore we have $RC (A) \subseteq RC (RC (A))$

Now suppose that $a \in RC (RC (A))$ and we want to prove that $a \in RC (A)$ . By assumption $a = y x$ for some $x, y$ with $x y \in RC (A)$ . Since $x y \in RC (A)$ , we can split that string somewhere and glue it in reverse order to get an element of $A$ . The split point lies either inside $x$ or inside $y$ , so there are two cases, that is $x y \in RC (A)$ iff

$x y = x_{a} x_{b} y$ and $x_{b} y x_{a} \in A$
$x y = x y_{a} y_{b}$ and $y_{b} x y_{a} \in A$

If 1 holds true then $a = y x = (y x_{a}) x_{b}$ , and gluing these two pieces the other way around gives

x_{b} y x_{a} \in A

so $a \in RC (A)$ as needed. Similarly if

2

holds true then $a = y x = y_{a} (y_{b} x)$ , and gluing the other way around gives

y_{b} x y_{a} \in A

so in either case we've shown that

a = y x \in RC (A)

thus we have both inclusions so the sets are equal.
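For a finite language the rotational closure can be computed directly, which gives a quick sanity check of the statement just proved. The sketch below represents a language as a Python `set` of `str`; the example language is made up.

```python
def rc(A):
    """RC(A) = {y x : x y in A}: every rotation of every string in A."""
    # Splitting w as x = w[:i], y = w[i:] gives the rotation y x = w[i:] + w[:i].
    return {w[i:] + w[:i] for w in A for i in range(len(w) + 1)}


EXAMPLE = {"abc", "de", ""}
```

Here `rc(EXAMPLE)` contains the three rotations of `abc`, both rotations of `de`, and the empty string, and applying `rc` a second time adds nothing new.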

Regular Languages Are Closed Under Rotational Closure

reg (A) ⟹ reg (RC (A))

Since $reg (A)$ then there is a DFA D that recognizes $A$ , we first come up with an idea for a single string and show how to extend it to all strings.

Given the string $a \in RC (A)$ , we know that $a = y x$ for some $x, y \in Σ^{*}$ where $x y \in A$ . Since $lang (D) = A$ we know that $x y$ is accepted by $D$ , therefore after reading $x$ the DFA D will be at some state $q_{1}$ , and then after reading $y$ from there it ends up in an accept state.

We construct an NFA by extending the DFA: start running the DFA from $q_{1}$ (since a DFA is an NFA this is fine), and once we get to an accepting state add an epsilon transition back to the start state of the DFA and continue running the logic of the DFA. If the NFA takes that epsilon transition after reading $y$ and ends back at state $q_{1}$ after reading $x$ , then $x y \in A = lang (D)$ . To make this the only way to accept, the NFA must also remember whether the epsilon transition has been taken, and accept only at $q_{1}$ after taking it exactly once.

To generalize this to work for all possible strings we allow any state to take the place of $q_{1}$ : we duplicate our DFA once for each state, add epsilon transitions from each accepting state back to the start state within each copy, and then to finish the generalization we add a new start state with an epsilon transition to the designated state of each copy.

Context Free Grammar

A context-free grammar is a 4-tuple

(V, Σ, R, S)

such that

$V$ is a finite set called the variables
$Σ$ is a finite set, disjoint from $V$ called the terminals
$R$ is a finite set of rules, each a tuple of the form $(v, Y)$ where $v \in V$ and $Y = (y_{1}, \dots, y_{n})$ is a sequence of elements from ${(V \cup Σ)}^{*}$ , which is notated by $v \to y_{1} | y_{2} | \dots | y_{n}$
An $S \in V$ which is the start variable

Application of a Rule to a Variable

Suppose that

Y \in V \cup Σ

and a rule

R = Y \to y_{1} | y_{2} | \dots | y_{n}

, then we define

app (R, x) = \begin{cases} {y_{1}, \dots, y_{n}, x} & if x = Y \\ {x} & otherwise \end{cases}

Note that if a rule matches we also allow for app, to not do anything by adding in $x$ to the set it evaluates to.

$A \to 0 A 1$
$A \to B$
$B \to #$

One Layer Productions of a Rule

Suppose

X \in {(V \cup Σ)}^{*}

where

X = (x_{1}, \dots, x_{k})

and we have a rule

R \to y_{1} | y_{2} | \dots | y_{n}

, we define

olprod (R, X) = app (R, x_{1}) \times \dots \times app (R, x_{k})

and note that

olprod (R, X) : {(V \cup Σ)}^{*} \to {(V \cup Σ)}^{*}

Note that we have $A B C D \in olprod (X \to A | B | C | D, X X X X)$ , but sometimes we want to only allow one rule to be applied at a time.

The above definition captures all the possible strings that you can produce given a string in your CFG and then applying a specific rule. We now overload the notation to show all possible productions using all rules for a specific string:

One Layer Productions for All Rules

Suppose

X \subseteq {(V \cup Σ)}^{*}

with rules

R_{1}, \dots, R_{n}

, then we define

olprod ({R_{1}, \dots, R_{n}}, X) = ⋃_{i = 1}^{n} olprod (R_{i}, X)

Now we have to generalize this definition to be able to recursively apply onto a string, and not just one layer

Productions of a String

Suppose

X \subseteq {(V \cup Σ)}^{*}

for a CFG with rules

R

, then we define via function composition

prod (X) = ⋃_{i = 1}^{\infty} {olprod}^{\circ i} (R, X)

Language of a Context Free Grammar

lang (C) = prod (S) \cap Σ^{*}

Note that this implies that for any $x \in lang (C)$ there is at least one sequence of rules $R_{1}, \dots, R_{k}$ such that $x \in olprod (R_{k}, olprod (R_{k - 1}, \dots olprod (R_{1}, S) \dots))$

Note that sometimes productions can result in strings which are not all terminal symbols, therefore a language of a CFG are those productions which are entirely terminal symbols.

Note that sometimes we will want to count the number of steps required to derive a string from a CFG. In that case it can be hard to deduce how many steps the derivation requires, because many characters can change in one of the one layer productions, so we introduce the yields of a rule, which only allow one character to change.

One Layer Yields of a Rule

ylds (R, x) = {y \in olprod (R, x) : hamdist (x, y) = 1}

Note that we may use $ylds$ instead of $olprod$ and receive the same language.

Derivation

A derivation is a string that is produced by a sequence of yields.
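To make derivations concrete, the sketch below enumerates the short strings generated by the example rules $A \to 0 A 1$ , $A \to B$ , $B \to #$ , rewriting one variable occurrence per step in the spirit of $ylds$ . It is a simplification of the tuple-based definitions above: sentential forms are plain Python strings, variables are the dictionary keys, and the names `yields` and `language_up_to` are made up. Pruning by length is only sound here because no rule in this grammar shortens a form.

```python
from collections import deque

RULES = {"A": ["0A1", "B"], "B": ["#"]}
START = "A"


def yields(form):
    """All forms obtained by rewriting exactly one variable occurrence."""
    results = set()
    for i, symbol in enumerate(form):
        for body in RULES.get(symbol, []):
            results.add(form[:i] + body + form[i + 1:])
    return results


def language_up_to(max_len):
    """Terminal strings of length <= max_len derivable from the start variable."""
    seen = {START}
    queue = deque([START])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(symbol in RULES for symbol in form):
            words.add(form)            # only terminals left: a member of lang(C)
            continue
        for nxt in yields(form):
            if len(nxt) <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return words
```

For this grammar the enumeration produces strings of the shape $0^{n} # 1^{n}$ , which is the intersection with the terminal strings described in the definition of the language of a CFG.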

A Language Is Context Free

We say that a given language

L

is context free if there is some CFG C such that

lang (C) = L

and we write

cf (L)

Context Free Languages Are Closed Under Union

cf (A) \land cf (B) ⟹ cf (A \cup B)

Since $cf (A)$ and $cf (B)$ , we have CFGs $C_{A}$ and $C_{B}$ such that $lang (C_{A}) = A$ and $lang (C_{B}) = B$ . We construct a new CFG $C$ such that $lang (C) = A \cup B$ . We do so by first differentiating the variables in each grammar, subscripting each variable with $A$ or $B$ respectively, then setting our variables as the union of the variables in both together with a new start variable $S$ , and also taking the union of the rules in both. Finally we tack on one rule, which is $S \to S_{A} | S_{B}$ .

TODO: Add the proof that this actually generates what we expect.

Star Is Context Free

Suppose that

A

is finite, then

A^{*}

is context free.

We define the following CFG C

$V = {S}$
$Σ = A$
Since $A$ is finite, then $A = {a_{1}, a_{2}, \dots, a_{k}}$ and then we define the single rule: $S \to a_{1} S | a_{2} S | \dots | a_{k} S | ϵ$

Now given any element $x \in A^{*}$ then $x = (a_{f (1)}, \dots, a_{f (j)})$ for some function $f$ and some $j$ , thus by sequentially picking the associated alternative of the form $S \to a_{f (i)} S$ , and then using the epsilon rule, we've shown that $A^{*} \subseteq lang (C)$

The other direction is obvious since $lang (C) = prod (S) \cap A^{*} \subseteq A^{*}$

Context Free Grammar for the Language of Binary Strings Which Start and End With the Same Character

Construct a CFG C such that

lang (C) = {w \in {0, 1}^{*} : w [0] = w [- 1]}

Note that

0, 1

are in the above language

We construct

C

$Σ = {0, 1}$
We take inspiration from the CFG which generates the star, and introduce one extra rule

$S \to 0 X 0 | 1 X 1 | 0 | 1$
$X \to 0 X | 1 X | ϵ$

We claim that $lang (C)$ is the desired set. Suppose that $s$ is a string in the desired set; if it is either of ${0, 1}$ it is immediately in the generated language by rule 1. Otherwise it is a string of the form $0 z 0$ or $1 z 1$ , where $z \in {0, 1}^{*}$ , and since we proved previously that the rule $X \to \dots$ alone generates ${0, 1}^{*}$ , we know that the second rule will match $z$ as needed. We've just shown that any string from the desired set is in the language.

Now if we instead suppose that we have any $x \in lang (C)$ , then we have to show that it's in the desired set. Since $lang (C) = prod (S) \cap {0, 1}^{*}$ , any element in there is produced by starting with the first rule followed by any sequence of rules. Rule one automatically guarantees that the first and last character are the same, and subsequent rules cannot modify the first and last character of the string, so $lang (C)$ is a subset of the desired set, showing that the two sets are equal as required.

For Every DFA There Is an Equivalent CFG

Suppose that

D

is a DFA, then there exists a CFG

C

such that

lang (D) = lang (C)

Make a variable $R_{i}$ for each state $q_{i}$ in the DFA. Add the rule $R_{i} \to a R_{j}$ to the CFG if $δ (q_{i}, a) = q_{j}$ is a transition in the DFA. Add the rule $R_{i} \to ϵ$ if $q_{i}$ is an accept state of the DFA. Make $R_{0}$ the start variable of the grammar, where $q_{0}$ is the start state of the machine.
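
As a rough illustration of the construction above, the following Python sketch builds the rules $R_{i} \to a R_{j}$ and $R_{i} \to ϵ$ from a transition table and compares the grammar with the DFA on all short strings. The function names, the dictionary encoding of $δ$ and the example DFA (binary strings with an even number of 1s) are assumptions made for this sketch, not part of the notes.

```python
from itertools import product


def dfa_to_cfg(states, alphabet, delta, start, accepting):
    """Return (rules, start_variable) for the grammar built from a DFA.

    rules maps a variable name to a list of bodies; a body is either
    (a, R_j) for the rule R_i -> a R_j, or () for the rule R_i -> epsilon.
    """
    rules = {f"R_{q}": [] for q in states}
    for q in states:
        for a in alphabet:
            # delta(q, a) = q' gives the rule R_q -> a R_q'
            rules[f"R_{q}"].append((a, f"R_{delta[(q, a)]}"))
        if q in accepting:
            # accept states get the epsilon rule
            rules[f"R_{q}"].append(())
    return rules, f"R_{start}"


def dfa_accepts(delta, start, accepting, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in accepting


def cfg_derives(rules, start, w):
    """Membership test that only works for grammars of the shape built above."""
    variables = {start}
    for a in w:
        variables = {body[1] for v in variables for body in rules[v]
                     if len(body) == 2 and body[0] == a}
    return any(() in rules[v] for v in variables)


# Hypothetical example DFA: accepts binary strings with an even number of 1s.
STATES = [0, 1]
ALPHABET = "01"
DELTA = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}
RULES, START = dfa_to_cfg(STATES, ALPHABET, DELTA, 0, {0})

# The DFA and the grammar agree on every binary string of length < 6.
for n in range(6):
    for letters in product(ALPHABET, repeat=n):
        w = "".join(letters)
        assert dfa_accepts(DELTA, 0, {0}, w) == cfg_derives(RULES, START, w)
```

The check is only a finite sanity test on short strings; it does not replace the proof that the two languages are equal.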

Chomsky Normal Form

A CFG is in Chomsky normal form if every rule is of the form

$A \to B C$
$A \to a$
$S \to ϵ$

Where

A \in V, \quad B, C \in V ∖ {S}

and

a \in Σ

, and we write

cnf (C)

to denote that the CFG $C$ is in Chomsky normal form.

Every Context Free Grammar Is Equivalent to a Context Free Grammar in Chomsky Normal Form

For every CFG

C

there exists a CFG

B

such that

cnf (B)

and

lang (C) = lang (B)

We lay out an iterative process which constructs a new grammar in Chomsky Normal form as follows:

We first define our set of rules $R_{0} = R \times {F}$ to be the original set of rules, each tagged with a flag, where $T$ means we've processed the rule and $F$ means we haven't. We then add a new start variable $S_{0}$ and the rule $S_{0} \to S$ , i.e. $R_{1} = R_{0} \cup {((S_{0} \to S), F)}$ where $S$ was the original start variable.

Now for each element in $R_{i}$ of the form $((v \to ϵ), F)$ , we find all rules of the form $((v_{2} \to W), F)$ where $v$ occurs in $W$ , add the rules $v_{2} \to W^{'}$ for each $W^{'} \in olprod ((v \to ϵ), W)$ , and remove $((v \to ϵ), F)$ . On each iteration the number of epsilon rules goes down by one, thus after a finite number of iterations there will be no more epsilon rules of the form $((v \to ϵ), F)$ , although there may be epsilon rules of the form $((v \to ϵ), T)$ , which is ok.

Next we remove all rules of the form $((v \to w), F)$ which map a variable to a variable. We do so by finding all rules of the form $((w \to W),_)$ and adding the rule $v \to W$ , unless this rule was previously removed (TODO define the removed set)

Finally we convert all the remaining rules to the proper form, the remaining possibilities of rules that need conversion are rules

$v \to W$ where $W \in {(V \cup Σ)}^{*}, | W | \geq 3$
$v \to x y$ where $x, y \in Σ$

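For the first case, since $W = (w_{1}, \dots, w_{k})$ where $k \in ℕ_{3}$ , we replace the rule $v \to W$ with the chain of rules $v \to w_{1} W_{1}$ , $W_{1} \to w_{2} W_{2}$ , $\dots$ , $W_{k - 2} \to w_{k - 1} w_{k}$ , where $W_{1}, \dots, W_{k - 2}$ are new variables; any terminal left in these rules is then handled as in the second case.

For the second case we just replace the terminals with new variables whose rules point to them.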
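
The splitting of a long rule into a chain of two-symbol rules can be sketched in Python as follows. This is the standard chain construction, which matches the $W_{1} \to A B$ step used in the worked conversion; the helper names `split_long_rule` and `fresh_variables` and the tuple encoding of rules are hypothetical choices for this sketch.

```python
def fresh_variables(prefix="W"):
    """Yield W_1, W_2, ... as names for newly introduced variables."""
    i = 1
    while True:
        yield f"{prefix}_{i}"
        i += 1


def split_long_rule(head, body, fresh):
    """Replace head -> w1 w2 ... wk (k >= 3) by rules with two symbols each.

    Rules are returned as (head, body) pairs, where body is a tuple of symbols.
    """
    rules = []
    while len(body) > 2:
        new_var = next(fresh)
        # head -> w1 NEW, and NEW takes over the rest of the body
        rules.append((head, (body[0], new_var)))
        head, body = new_var, body[1:]
    rules.append((head, tuple(body)))
    return rules


print(split_long_rule("S_0", ("B", "A", "B"), fresh_variables()))
```

A body with $k$ symbols produces $k - 1$ rules and introduces $k - 2$ new variables.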

Chomsky Conversion

Consider the following grammar:

$A \to B A B | B | ϵ$
$B \to 00 | ϵ$

find an equivalent grammar in Chomsky Normal form.

Our initial rules, after adding the new start variable $S_{0}$ , are:

$S_{0} \to A$
$A \to B A B$
$A \to B$
$A \to ϵ$
$B \to 00$
$B \to ϵ$

we start by removing

B \to ϵ

based on the conversion algorithm the rules of interest are

A \to B, A \to B A B

, now we have (note that

olprod

are all the "one layer" productions of a rule against a string, that is every string obtained by applying the rule to some subset of the matching occurrences)

olprod (B \to ϵ, B) = {B, ϵ}

and

olprod (B \to ϵ, B A B) = {B A B, A B, B A, A}

Thus the rule

A \to B

becomes

A \to B | ϵ

and the rule

A \to B A B

becomes

A \to B A B | A B | B A | A

thus, dropping the rule $A \to A$ as it does nothing, the current set of rules is

$S_{0} \to A$
$A \to B A B$
$A \to A B$
$A \to B A$
$A \to B$
$A \to ϵ$
$B \to 00$

Now we remove the rule $A \to ϵ$ . The rules of interest are $S_{0} \to A$ , $A \to B A B$ , $A \to A B$ and $A \to B A$ , and we have $olprod (A \to ϵ, A) = {A, ϵ}$ , $olprod (A \to ϵ, B A B) = {B A B, B B}$ , $olprod (A \to ϵ, A B) = {A B, B}$ and $olprod (A \to ϵ, B A) = {B A, B}$ so the current set of rules becomes

$S_{0} \to A$
$S_{0} \to ϵ$
$A \to B A B$
$A \to A B$
$A \to B A$
$A \to B B$
$A \to B$
$B \to 00$

note that the rule

S_{0} \to ϵ

is allowed to stay, since Chomsky normal form permits an epsilon rule on the start variable.

Since all epsilon rules other than $S_{0} \to ϵ$ have been eliminated we start removing variable to variable rules. We start with the rule $A \to B$ ; the only rule of interest is $B \to 00$ , so our rules become

$S_{0} \to A | ϵ$
$A \to B A B | B A | A B | 00 | B B$
$B \to 00$

Another unit rule is $S_{0} \to A$ ; replacing it with the bodies of the rules for $A$ gives

$S_{0} \to B A B | B A | A B | 00 | B B | ϵ$
$A \to B A B | B A | A B | 00 | B B$
$B \to 00$

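Since $S_{0} \to 00$ , $A \to 00$ and $B \to 00$ are of the form $v \to x y$ where $x, y \in Σ$ , we introduce a new variable $U$ with the rule $U \to 0$ and replace each $00$ with $U U$ , giving

$S_{0} \to B A B | B A | A B | U U | B B | ϵ$
$A \to B A B | B A | A B | U U | B B$
$B \to U U$
$U \to 0$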

Now the last two rules that need simplification are $S_{0} \to B A B$ and $A \to B A B$ . As per the conversion procedure we replace the rule $S_{0} \to B A B$ with $S_{0} \to B W_{1}$ and $W_{1} \to A B$ , and similarly we replace the rule $A \to B A B$ with $A \to B W_{2}$ and $W_{2} \to A B$ , but note that the rules for $W_{1}$ and $W_{2}$ are the same, and thus can be joined into the single rule $W_{1} \to A B$ yielding

$S_{0} \to B W_{1} | B A | A B | U U | B B | ϵ$
$A \to B W_{1} | B A | A B | U U | B B$
$W_{1} \to A B$
$B \to U U$
$U \to 0$
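
One way to sanity check the result is to run a CYK membership test on the final grammar. The original grammar generates exactly the even-length strings of zeros (every $B$ contributes $00$ or nothing), so the converted grammar should accept $0^{n}$ precisely when $n$ is even. The sketch below assumes a dictionary encoding of the rules and the hypothetical function name `cyk_accepts`; the variable names `S0` and `W1` stand for $S_{0}$ and $W_{1}$ .

```python
def cyk_accepts(rules, start, w):
    """CYK membership test for a grammar in Chomsky normal form.

    rules maps each variable to a list of bodies; a body is a tuple that is
    empty (epsilon), holds one terminal, or holds two variables.
    """
    n = len(w)
    if n == 0:
        return () in rules[start]
    # table[i][l] = set of variables deriving the substring w[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(w):
        for var, bodies in rules.items():
            if (ch,) in bodies:
                table[i][1].add(var)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for var, bodies in rules.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[i][length].add(var)
    return start in table[0][n]


# The final grammar from the conversion above.
CNF = {
    "S0": [("B", "W1"), ("B", "A"), ("A", "B"), ("U", "U"), ("B", "B"), ()],
    "A": [("B", "W1"), ("B", "A"), ("A", "B"), ("U", "U"), ("B", "B")],
    "W1": [("A", "B")],
    "B": [("U", "U")],
    "U": [("0",)],
}

for n in range(9):
    assert cyk_accepts(CNF, "S0", "0" * n) == (n % 2 == 0)
```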

For Any Context Free Grammar in Chomsky Normal Form Exactly 2n - 1 Steps Are Required for Any Derivation

Suppose that

G

is a CFG in Chomsky normal form, then for any

w \in lang (G)

of length

n \in ℕ_{1}

, exactly

2 n - 1

steps are required for any derivation of

w

Since $G$ is in normal form, there may be a rule $S_{0} \to ϵ$ . If that's the case then this rule can only ever derive a string of length zero, that is $ϵ$ , because the start variable never appears on the right hand side of a rule. Since we only care about strings of length greater than or equal to one, the rule $S_{0} \to ϵ$ can never be used to derive such a string.

Thus the only rules used must be of the form $v \to v_{1} v_{2}$ or $v \to t$ where $t \in Σ$ . Note that a rule of the form $v \to t$ never changes the length of an intermediate string, and a rule of the form $v \to v_{1} v_{2}$ always extends the length of the intermediate string by $1$ each time it is applied.

Thus if we are able to derive a string of length $n$ and a derivation always starts with a single variable $S_{0}$ then a rule of the form $v \to v_{1} v_{2}$ must be used $n - 1$ times to have a string of length $n$ , additionally since the rule of the form $v \to v_{1} v_{2}$ only introduces variables and never any terminal symbols, then we will require the rule $v \to t$ to be used $n$ times to convert each of the $n$ variables produced by $v \to v_{1} v_{2}$ , thus there will be exactly $2 n - 1$ steps in any derivation for a string of length $n$ .
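
The count can be checked experimentally on a small grammar. The sketch below enumerates every leftmost derivation of a given non-empty string and collects the number of steps used; the grammar is the Chomsky normal form grammar for even-length strings of zeros, and the function name `derivation_lengths` and the rule encoding are assumptions of this sketch. It only looks at leftmost derivations, which is enough to see the $2 n - 1$ pattern but is not a proof.

```python
def derivation_lengths(rules, start, w):
    """Collect the step counts of all leftmost derivations of a non-empty w."""
    n = len(w)
    lengths = set()
    stack = [((start,), 0)]
    while stack:
        form, steps = stack.pop()
        # index of the leftmost variable in the sentential form, if any
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            if "".join(form) == w:
                lengths.add(steps)
            continue
        # everything left of idx is terminal and must already match w
        if "".join(form[:idx]) != w[:idx]:
            continue
        for body in rules[form[idx]]:
            if len(body) == 0:
                continue  # the epsilon rule cannot help derive a non-empty string
            new_form = form[:idx] + body + form[idx + 1:]
            if len(new_form) <= n:  # forms never shrink, so longer ones are dead ends
                stack.append((new_form, steps + 1))
    return lengths


CNF = {
    "S0": [("B", "W1"), ("B", "A"), ("A", "B"), ("U", "U"), ("B", "B"), ()],
    "A": [("B", "W1"), ("B", "A"), ("A", "B"), ("U", "U"), ("B", "B")],
    "W1": [("A", "B")],
    "B": [("U", "U")],
    "U": [("0",)],
}

assert derivation_lengths(CNF, "S0", "00") == {3}      # 2 * 2 - 1
assert derivation_lengths(CNF, "S0", "0000") == {7}    # 2 * 4 - 1
```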

Pushdown Automaton

A pushdown automaton is a 6-tuple

Q, Σ, Γ, δ, q_{0}, F

where

Q, Σ, Γ

and

F

are all finite sets and

$Q$ is the set of states
$Σ$ is the input alphabet
$Γ$ is the stack alphabet
$δ : Q \times Σ_{ϵ} \times Γ_{ϵ} \to 𝒫 (Q \times Γ_{ϵ})$ is the transition function
$q_{0} \in Q$ is the start state
$F \subseteq Q$ is the set of accept states

A Pushdown Automaton Accepts an Input

Given an input

w = (w_{1}, \dots, w_{m})

where

w_{i} \in Σ_{ϵ}

we say that the pushdown automaton

P

accepts

w

if there exists a sequence of states

r_{0}, \dots r_{m} \in Q

and stack history strings

s_{0}, s_{1}, \dots s_{m} \in Γ^{*}

such that:

$r_{0} = q_{0}$ and $s_{0} = ϵ$
for each $i \in {0, \dots, m - 1}$ we have $(r_{i + 1}, b) \in δ (r_{i}, w_{i + 1}, a)$ where $s_{i} = a t$ and $s_{i + 1} = b t$ for some $a, b \in Γ_{ϵ}$ and $t \in Γ^{*}$
$r_{m} \in F$

We define the following notation for the transition function of a pushdown automaton:

$a, b \to c$ : the machine may make this transition by reading $a$ , popping $b$ from the stack and pushing $c$ to the stack
$a, ϵ \to c$ : the machine may make this transition by reading $a$ and pushing $c$ to the stack
$a, b \to ϵ$ : the machine may make this transition by reading $a$ and popping $b$ from the stack
$a, ϵ \to ϵ$ : the machine may make this transition by reading $a$ and not doing anything to the stack
$ϵ, b \to c$ : the machine may make this transition by popping $b$ from the stack and pushing $c$ to the stack, without reading any symbol from the input.

If you are at a state, and one of the outgoing rules is $a, b \to c$ , then if $b \neq ϵ$ and the top of the stack is not $b$ then this rule cannot be used. If $b = ϵ$ , then the rule can be used.

Similar to an NFA, when we write out a PDA, if there are no arrows for a particular state, input character and stack character triple, then this means that it maps to the empty set. If you get to a point where there are no possible outgoing rules to be used, then the PDA is said to be stuck, and that line of execution terminates.

Pushdown Automaton That Models the Language of Binary Strings Which Start and End With the Same Character

As per title.

The above PDA works because the bottom path accepts the strings $0$ and $1$ , while the top path reads the first character and pushes it onto the stack. Non-determinism then kicks in and the machine reads some number of characters from the input without pushing anything to the stack. In one of the non-deterministic branches all but one character has been read from the input, and the last transition then reads a character and attempts to pop that same character off the stack; if that succeeds the machine ends up in the terminal state. If there were any more characters to read, then since there are no subsequent states that branch would terminate. This implies that the only way a string can be accepted by this PDA is if it starts and ends with the same character.

Note that we could also model this "termination behavior" discussed above by pushing the dollar sign on, and then having a transition that takes it off the stack.
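
The acceptance condition for a PDA can be explored with a small breadth-first search over configurations (state, input position, stack). The transition table below is an assumption reconstructed from the prose description of the two paths, not a copy of the diagram; the state names, the empty string standing for $ϵ$ , and the `max_stack` bound (added so the search always terminates) are likewise choices made for this sketch.

```python
from collections import deque
from itertools import product


def pda_accepts(delta, start, accepting, w, max_stack=None):
    """Search for an accepting run; the stack is a string with its top at index 0."""
    if max_stack is None:
        max_stack = len(w) + 2  # assumption: enough for this example PDA
    start_config = (start, 0, "")
    seen = {start_config}
    queue = deque([start_config])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(w) and state in accepting:
            return True
        # read nothing, or read the next input symbol
        reads = [""] + ([w[pos]] if pos < len(w) else [])
        # pop nothing, or pop the current top of the stack
        pops = [""] + ([stack[0]] if stack else [])
        for a in reads:
            for b in pops:
                for new_state, c in delta.get((state, a, b), ()):
                    new_stack = c + stack[len(b):]
                    if len(new_stack) > max_stack:
                        continue
                    config = (new_state, pos + len(a), new_stack)
                    if config not in seen:
                        seen.add(config)
                        queue.append(config)
    return False


# Hypothetical PDA: q0 start, q1 middle loop, q2 accepting with no outgoing arrows.
DELTA = {
    ("q0", "0", ""): {("q1", "0"), ("q2", "")},  # push first char, or accept "0"
    ("q0", "1", ""): {("q1", "1"), ("q2", "")},  # push first char, or accept "1"
    ("q1", "0", ""): {("q1", "")},               # skip a middle character
    ("q1", "1", ""): {("q1", "")},
    ("q1", "0", "0"): {("q2", "")},              # last char matches the first
    ("q1", "1", "1"): {("q2", "")},
}

for n in range(6):
    for letters in product("01", repeat=n):
        w = "".join(letters)
        assert pda_accepts(DELTA, "q0", {"q2"}, w) == (n >= 1 and w[0] == w[-1])
```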

Pushdown Automaton That Models the Language of Binary Strings of Odd Length With a Zero in the Middle

As per title.

The second half of this PDA reads an input, and removes whatever character is on the top of the stack over and over non-deterministically, since there are no outgoing transitions from the terminal state, then a string is only accepted if it is entirely read by the non-determinism, additionally the fact that there is a dollar sign transition to the last state means that the stack must also be emptied by the time we get to the state before the terminal state.

If fewer than half of the characters, that is fewer than $⌊ \frac{| s |}{2} ⌋$ , are read during the first part of the PDA, then the stack will become empty before all the characters are read, causing the string to not be accepted. Similarly if more than half the string is read during the first part of the PDA, then the dollar sign will not be on top of the stack when we try to move to the terminal state.

Therefore if a string is accepted it must be that exactly half the string is read during the first part of the PDA. Now note that there is a transition which reads $0$ but does nothing to the stack; this is the middle character. In the second half of the PDA each step reads a character and pops a character off the stack, which makes sure that the number of characters in the first half equals the number of characters in the second half, showing that it accepts all strings of odd length with $0$ in the middle.

The Half Zero Hash Sandwiches Are Non-regular

The language

L = {0^{k} # 0^{2 k} : k \in ℕ_{0}}

is not regular

Suppose for the sake of contradiction that this language was regular, therefore there is some $p \in ℕ_{1}$ such that the properties of the pumping lemma hold. If we consider the string $0^{p} # 0^{2 p} \in L$ then by the lemma we have that $0^{p} # 0^{2 p} = x y z$ .

By the third property we have that $| x y | \leq p$ . Since the first $p$ characters of our string are all $0$ this implies that $x = 0^{j}$ and $y = 0^{l}$ , and by the second property of the lemma we also know that $l \geq 1$ . We also know that $#$ occurs in $z$ . By the first property we can set $i = 0$ to get $x y^{0} z = x z \in L$ , but $x z = 0^{p - l} # 0^{2 p} \notin L$ since $2 (p - l) \neq 2 p$ , thus a contradiction so $L$ is not regular

A CFG Which Is Not Regular

Let

G = (V, Σ, R, S)

with

V = {S, T, U}

and

Σ = {0, #}

be the following grammar defined by the following rules:

$S \to T T | U$
$T \to 0 T | T 0 | #$
$U \to 0 U 00 | #$

Show that

lang (G)

is not regular

First we give an informal description of the language, since the language of a CFG equals to $prod (S) \cap Σ^{*}$ , then by induction on the length of the string produced by the second rule, starting with length 1, we can prove that productions of this rule are given by all strings in $Σ^{*}$ such that the string contains exactly one $#$ .

Using a similar analysis the productions of the third rule in isolation are given by all strings in $Σ^{*}$ such that there is exactly one $#$ and if there are $n$ zeros to the left of $#$ then there are $2 n$ zeros on the right.

Finally the initial rule produces either the concatenation of two strings generated by $T$ , which contains exactly two $#$ 's, or a string generated by $U$ . We now move on to showing that the language is not regular.

Suppose for the sake of contradiction that $lang (G)$ is regular. We obtain a contradiction by using the closure rules for regular languages to focus on a more specific subset of the language which should also be regular.

Since $lang (G)$ is regular, and the language of the regular expression $0^{*} # 0^{*}$ is also regular, their intersection is regular. Every string in $0^{*} # 0^{*}$ contains exactly one $#$ , so the intersection keeps only the strings generated by $U$ , that is $A = lang (G) \cap 0^{*} # 0^{*} = {0^{k} # 0^{2 k} : k \in ℕ_{0}}$ is regular, but we know that it is not, thus a contradiction, so $lang (G)$ must not be regular.

If a Language Is Context Free Then There Is Some Pushdown Automaton That Models It

TODO: Add the proof here.

A Language Is Context Free Iff There Is Some Pushdown Automaton That Models It

Suppose that

L

is a language then

cf (L) ⟺ \exists P, lang (P) = L

TODO: Add the proof here.

Pumping for Context-free Languages

Suppose that

cf (A)

for some language

A

, then there is a number

p

where if

s \in A

and

| s | \geq p

then

s = u v x y z

such that

for each $i \geq 0, u v^{i} x y^{i} z \in A$
$| v y | > 0$
$| v x y | \leq p$

TODO: Add the proof here.

The Language of Abcs Is Not Context Free

The language

L = {a^{n} b^{n} c^{n} : n \in ℕ_{0}}

is not context free

Suppose that the language is context free, therefore the pumping lemma holds, and thus there is some $p$ with the stated properties. Consider $s = a^{p} b^{p} c^{p} \in L$ . Since $| s | \geq p$ then by the pumping lemma we can write $s = u v x y z$ , where at least one of $v$ or $y$ is non-empty.

If $v, y \in a^{*} \cup b^{*} \cup c^{*}$ , which is to say they each consist of only one kind of character, then pumping to $s^{'} = u v^{2} x y^{2} z$ increases the count of at most two of the three characters while leaving the third unchanged, which makes the following false $count (s^{'}, a) = count (s^{'}, b) = count (s^{'}, c)$ thus $s^{'} \notin L$ .

If either $v$ or $y$ contains more than one distinct character then suppose wlog that $v \in \dots a \dots b \dots$ , therefore $s^{'} = u v^{2} x y^{2} z \in \dots a \dots b \dots a \dots b \dots$ and we know that $\dots a \dots b \dots a \dots b \dots \cap L = \emptyset$ thus $s^{'} \notin L$

Thus no matter what we obtain a contradiction so it must be that $L$ is not context free.
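
For small values of $p$ the case analysis can be checked by brute force: enumerate every split $s = u v x y z$ with $| v y | > 0$ and $| v x y | \leq p$ and test whether pumping keeps the string in $L$ . The sketch below uses the hypothetical names `in_L` and `pumpable_splits` and only tries exponents $0$ to $3$ , which is already enough to rule out every split; it illustrates the argument for a few values of $p$ and is not a substitute for the proof.

```python
def in_L(s):
    """Membership in L = { a^n b^n c^n : n >= 0 }."""
    n = len(s) // 3
    return len(s) % 3 == 0 and s == "a" * n + "b" * n + "c" * n


def pumpable_splits(s, p):
    """Return every split (u, v, x, y, z) of s that survives pumping with i = 0..3."""
    good = []
    n = len(s)
    for i in range(n + 1):                          # u = s[:i]
        end = min(i + p, n)                         # enforces |vxy| <= p
        for j in range(i, end + 1):                 # v = s[i:j]
            for k in range(j, end + 1):             # x = s[j:k]
                for l in range(k, end + 1):         # y = s[k:l]
                    u, v, x, y, z = s[:i], s[i:j], s[j:k], s[k:l], s[l:]
                    if len(v) + len(y) == 0:
                        continue                    # need |vy| > 0
                    if all(in_L(u + v * t + x + y * t + z) for t in range(4)):
                        good.append((u, v, x, y, z))
    return good


for p in range(1, 4):
    s = "a" * p + "b" * p + "c" * p
    assert pumpable_splits(s, p) == []
```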

Context Free Languages Are Not Closed Under Intersection

cf (A) \land cf (B)

does NOT imply that

cf (A \cap B)

The languages

L_{1} = {a^{n} b^{n} c^{m} : n, m \in ℕ_{0}}

and

L_{2} = {a^{m} b^{n} c^{n} : n, m \in ℕ_{0}}

are context free, but

L_{1} \cap L_{2} = {a^{n} b^{n} c^{n} : n \in ℕ_{0}}

which was shown above not to be context free, so the intersection is not context free.

Zero One Zero One Language Is Not Context Free

Show that the language

L = {0^{n} 1^{n} 0^{n} 1^{n} : n \in ℕ_{0}}

is not context free

Suppose that the language is context free for contradiction, therefore the pumping lemma for context free languages holds true and we obtain some $p \in ℕ_{1}$ such that every string in $L$ of length at least $p$ can be split as in the lemma.

Now the string $s = 0^{p} 1^{p} 0^{p} 1^{p} \in L$ and clearly $| s | \geq p$ therefore $s = u v x y z$ . We know that $| v y | \geq 1$ so they are not both empty. It's either the case that $v, y \in 0^{*} \cup 1^{*}$ or not.

If $v, y \in 0^{*} \cup 1^{*}$ , then there are a couple of similar cases to consider. Since $| v x y | \leq p$ , the substring $v x y$ touches at most two adjacent blocks of $s$ , so each of $v$ and $y$ lies inside one of those two blocks. Suppose that $v \neq ϵ$ so that $v = 0^{j}$ where $j \in ℕ_{1}$ , and consider $s^{'} = u v^{2} x y^{2} z$ . The block of $0$ 's containing $v$ now has length greater than $p$ , whereas the other block of $0$ 's is untouched and still has length $p$ . Observe that in the given language $L$ both blocks of $0$ 's have the same length, so $s^{'} \notin L$ , which is a contradiction.

Similarly if $v = 1^{j}$ we obtain a symmetrical contradiction, and in the case where $y \neq ϵ$ an identical analysis on $y$ holds.

If it's not true that $v, y \in 0^{*} \cup 1^{*}$ then at least one of $v, y$ contains both a $0$ and a $1$ . Suppose it is $v$ , then $v = \dots 0 \dots 1 \dots \lor v = \dots 1 \dots 0 \dots$ . Since all strings in $L$ are a sequence of $0$ s, then a sequence of $1$ s, then a sequence of $0$ s, then a sequence of $1$ s, clearly $u v^{2} x y^{2} z \notin L$ as it contains more than four sequences of $0$ s and $1$ s, because $v^{2} = \dots 0 \dots 1 \dots 0 \dots 1 \dots$ or $v^{2} = \dots 1 \dots 0 \dots 1 \dots 0 \dots$ . If it is $y$ that contains both characters the same analysis applies, which is a contradiction.

Thus no matter the case of $v, y$ we always get a contradiction, therefore $L$ is not context free.

The Language of Palindromes With an Equal Number of 0s and 1s Is Not Context Free

L = {w \in {0, 1}^{*} : count (w, 0) = count (w, 1) \land rev (w) = w}

Suppose that $L$ is context free for the sake of contradiction, then the pumping lemma holds on this language and thus there is some $p \in ℕ_{1}$ , then we consider the string $s = 0^{p} 1^{2 p} 0^{p}$ which clearly has $| s | \geq p$ , thus $s = u v x y z$ .

We know that $v, y$ cannot both be empty. Suppose first that $v$ is empty and that $y$ is non-empty, then $y \in {0, 1}^{*} ∖ {ϵ}$ . If $count (y, 1) \neq count (y, 0)$ we have a problem because the string $s^{'} = u v^{2} x y^{2} z$ has the property that $count (s^{'}, 0) \neq count (s^{'}, 1)$ thus $s^{'} \notin L$ . Therefore we must have that $count (y, 0) = count (y, 1)$ , but since we assumed $v = ϵ$ then $s^{'} = u v^{2} x y^{2} z$ no longer has the property that $rev (s^{'}) = s^{'}$ because it changes the string at only one of the places where the $0$ 's meet the $1$ 's.

Now suppose that both $v, y$ are non-empty. If $v$ resides in the left half of the string and $y$ resides in the right half of the string, then because $| v x y | \leq p$ it must be that $v, y \in 1^{*} ∖ {ϵ}$ , which yields a contradiction since $s^{'} = u v^{2} x y^{2} z$ has more $1$ 's than $0$ 's. Therefore, they must both be substrings of either the right or left hand side of the string, so without loss of generality assume that they reside within the left side of the string. If $v, y \in 1^{*}$ then $u v^{2} x y^{2} z$ contains more $1$ s than $0$ s, which is a contradiction, and similarly if $v, y \in 0^{+}$ we get the same problem. If $01 \subseteq v$ then $u v^{2} x y^{2} z$ has two $01$ 's as substrings in the left side of the string whereas the right side has only one matching $10$ , so the string is no longer a palindrome, which is a contradiction.

Turing Machine

A turing machine is the following data:

$Q$ is the set of states
$Σ$ is the input alphabet not containing the blank symbol
$Γ$ is the tape alphabet which contains the blank symbol and $Σ \subseteq Γ$
$q : Q \times$

Configuration of a Turing Machine

A setting of the current state, the current tape contents, and the current head location is called a configuration of the turing machine

Start Configuration

Given a Turing Machine

M

with an input

w

, then the start configuration is the configuration

q_{0} w

Accepting Configuration

Is one such that the state of the configuration is

q_{a}

Rejecting Configuration

The state of the configuration is

q_{r}

Halting Configuration

A configuration that is either an accepting configuration or a rejecting configuration

The machine is defined to halt when in the states $q_{a}$ or $q_{r}$

A Turing Machine Accepts an Input

Suppose that

M

is a turing machine and

w

is an input, then we say that

M

accepts

w

if a sequence of configurations

C_{1}, C_{2}, \dots, C_{k}

exist where

$C_{1}$ is the start configuration of $M$ on input $w$
each $C_{i}$ yields $C_{i + 1}$
$C_{k}$ is an accepting configuration

When this is true we write that

acc (M, w)

A Turing Machine Rejects an Input

The turing machine enters

q_{r}

A Turing Machine Loops on an Input

The turing machine never enters a halting configuration
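
The three outcomes (accept, reject, loop) can be illustrated with a small simulator. The sketch assumes a deterministic transition function of the form $δ : Q \times Γ \to Q \times Γ \times {L, R}$ , encoded as a Python dictionary; this shape, the treatment of a missing transition as a reject, the step budget `max_steps` and both example machines are assumptions of this sketch. Since looping cannot be detected in general, the simulator returns `None` when the budget runs out, which only means that the machine did not halt within that many steps.

```python
def run_tm(delta, start, accept, reject, w, blank="_", max_steps=10_000):
    """Simulate a deterministic TM; return "accept", "reject" or None (budget hit)."""
    tape = dict(enumerate(w))   # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state in (accept, reject):
            break
        symbol = tape.get(head, blank)
        if (state, symbol) not in delta:
            state = reject      # assumption: an undefined transition rejects
            break
        state, write, move = delta[(state, symbol)]
        tape[head] = write
        # the head stays put when asked to move left from the leftmost cell
        head = head + 1 if move == "R" else max(head - 1, 0)
    if state == accept:
        return "accept"
    if state == reject:
        return "reject"
    return None


# Hypothetical machine: accepts strings of 0s of even length.
PARITY = {
    ("even", "0"): ("odd", "0", "R"),
    ("odd", "0"): ("even", "0", "R"),
    ("even", "_"): ("qa", "_", "R"),
    ("odd", "_"): ("qr", "_", "R"),
}

# Hypothetical machine that moves right forever and so never halts.
LOOPER = {
    ("q0", "0"): ("q0", "0", "R"),
    ("q0", "_"): ("q0", "_", "R"),
}

print(run_tm(PARITY, "even", "qa", "qr", "0000"))
print(run_tm(PARITY, "even", "qa", "qr", "000"))
print(run_tm(LOOPER, "q0", "qa", "qr", "00", max_steps=500))
```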

Language of a Turing Machine

lang (M) = {w : acc (M, w)}

Turing Recognizable

Given a language

L

we say that it is Turing Recognizable if there is some Turing machine

M

such that

L = lang (M)

Decider Turing Machine

A decider is a turing machine that doesn't loop on any input

A Turing Machine Decides a Language

We say that a turing machine

M

decides a language

L

if $M$ is a decider and $L = lang (M)$

Turing Decidable

We say that a language

L

is turing decidable if there is a turing machine that decides it.

Non-deterministic Turing Machine

A non-deterministic turing machine is a turing machine that may proceed according to several possibilities, with a transition function of the form

δ : Q \times Γ \to 𝒫 (Q \times Γ \times {L, R})

Every Nondeterministic Turing Machine Has an Equivalent Deterministic Turing Machine

TODO: Add the content for the proposition here.

TODO: Add the proof here.

The Acceptance Set for DFAs

A_{D F A} = {⟨ B, w ⟩ : B is a DFA that accepts input string w}

The Acceptance Set for DFAs Is a Decidable Language

TODO: Add the content for the proposition here.

To do so we just have to construct a turing machine

M

which decides

A_{D F A}
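
The idea behind such a machine is to simulate $B$ on $w$ directly, which always finishes after $| w |$ steps. A minimal Python sketch of that simulation follows; the tuple encoding of $⟨ B, w ⟩$ , the function name `decide_A_DFA` and the example DFA are assumptions for illustration, and checking that the encoding really describes a valid DFA is reduced here to an alphabet check.

```python
def decide_A_DFA(dfa, w):
    """Return True iff the DFA accepts w; always halts after len(w) steps."""
    states, alphabet, delta, start, accepting = dfa
    # reject inputs that are not strings over the DFA's alphabet
    if any(ch not in alphabet for ch in w):
        return False
    state = start
    for ch in w:
        state = delta[(state, ch)]
    return state in accepting


# Hypothetical example: DFA accepting binary strings that end in 1.
ENDS_IN_ONE = (
    {"a", "b"},
    {"0", "1"},
    {("a", "0"): "a", ("a", "1"): "b", ("b", "0"): "a", ("b", "1"): "b"},
    "a",
    {"b"},
)

print(decide_A_DFA(ENDS_IN_ONE, "0101"))
print(decide_A_DFA(ENDS_IN_ONE, "0110"))
```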

A Language Is Turing Recognizable Iff It Is a Projection of a Decidable Language

Let

C

be a language. Prove that

C

is Turing-recognizable iff a decidable language

D

exists such that

C = {x : \exists y, ⟨ x, y ⟩ \in D}

Suppose that $C$ is Turing-recognizable, therefore there is some turing machine $M$ that recognizes $C$ so that $lang (M) = C$ , which means that for any $x \in C$ the machine $M$ halts and accepts on input $x$ . For each $x \in C$ , let the number of steps taken by the turing machine before accepting be $h_{x} \in ℕ_{1}$ . We define the language $D = {⟨ x, y ⟩ : M accepts x within y steps}$ . This language is decidable, since we can simulate $M$ on $x$ for $y$ steps and accept exactly when it has accepted by then. If $x \in C$ then $⟨ x, h_{x} ⟩ \in D$ , and conversely if $⟨ x, y ⟩ \in D$ for some $y$ then $M$ accepts $x$ so $x \in C$ , as needed.

Suppose that such a language $D$ exists. Since it is decidable it is clearly recognizable and thus some enumerator enumerates it. If we take an enumerator for $D$ and drop the second component $y$ of every printed pair, we obtain an enumerator for $C$ , and therefore $C$ is Turing-recognizable.

An Enumerator Enumerates a Language

We say that an enumerator enumerates a language

L

if the set of strings it eventually prints out is exactly

L

A Language Is Turing Recognizable If and Only If Some Enumerator Enumerates It

TODO: Add the content for the proposition here.

TODO: Add the proof here.

Separating Language

Suppose that

A, B

are two disjoint languages, we say that a language

C

separates

A

and

B

if

A \subseteq C

and

B \subseteq \overline{C}

Co Turing Recognizable

We say that a language is co-Turing-recognizable if it is the complement of a Turing recognizable language

Any Two Disjoint Co Turing Recognizable Languages Are Separable by Some Decidable Language

Let

A

and

B

be two disjoint languages. Say that language

C

separates

A

and

B

if

A \subseteq C

and

B \subseteq \overline{C}

. Show that any two disjoint co-Turing-recognizable languages are separable by some decidable language.

Since we assumed that $A$ and $B$ are co-Turing-recognizable, that means that $\overline{A}$ and $\overline{B}$ are recognizable.

Since a language is turing recognizable iff it is enumerable by some enumerator then there exist enumerators $E_{A}, E_{B}$ that enumerate $\overline{A}, \overline{B}$ respectively.

We will now construct a turing machine $M$ and if we can make it a decider then we'll use $C = lang (M)$ .

Consider the turing machine such that given an input $w$ it will run $E_{A}$ and $E_{B}$ in parallel; if $E_{A}$ prints $w$ then we reject the input; if $E_{B}$ prints $w$ accept.

Now we must show that $A \subseteq C$ so suppose that $w \in A$ then since $A \cap B = \emptyset$ then we know that $w \notin B$ , that is $w \in \overline{B}$ and $w \notin \overline{A}$ , thus $w$ will eventually get printed out only by $E_{B}$ so the machine will accept, so $w \in C$

We also have to show that $B \subseteq \overline{C}$ so let $w \in B$ . Again, $A, B$ are disjoint so this implies that $w \notin A$ , that is $w \in \overline{A}$ , and we also know $w \notin \overline{B}$ , thus $w$ will eventually get printed by only $E_{A}$ so the machine will reject, and so $w \notin C$ , so $w \in \overline{C}$ as needed.

If $w \notin (A \cup B)$ then $w \in \overline{A} \cap \overline{B}$ therefore both $E_{A}$ and $E_{B}$ will print it in finitely many steps, thus our turing machine will halt. If $E_{A}$ prints it first, then reject, otherwise if it's $E_{B}$ then accept

Thus we've shown that $M$ halts on all inputs and thus is a decider, and since we showed that $A \subseteq C$ and $B \subseteq \overline{C}$ then $A$ and $B$ are separated by a decidable language.
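
The machine $M$ can be mimicked in Python by modelling the two enumerators as infinite generators, one for the complement of $A$ and one for the complement of $B$ , and stepping them in lockstep. The concrete languages over the natural numbers ( $A$ the multiples of 6, $B$ the odd numbers), the generator names and the function name `separate` are assumptions chosen only so that the example runs; the sketch relies on $A$ and $B$ being disjoint, which guarantees that every input is eventually printed by at least one generator.

```python
from itertools import count


def enum_not_A():
    """Enumerate the complement of A, where A = multiples of 6 (hypothetical)."""
    return (n for n in count() if n % 6 != 0)


def enum_not_B():
    """Enumerate the complement of B, where B = odd numbers (hypothetical)."""
    return (n for n in count() if n % 2 == 0)


def separate(w, enum_not_A, enum_not_B):
    """Decide membership in a separating language C: True = accept, False = reject."""
    # zip advances both enumerators one printed element at a time ("in parallel")
    for a, b in zip(enum_not_A(), enum_not_B()):
        if a == w:
            return False  # w is outside A, so rejecting cannot violate A subset of C
        if b == w:
            return True   # w is outside B, so accepting cannot put a string of B in C


print([w for w in range(20) if separate(w, enum_not_A, enum_not_B)])
```

Inputs that lie in neither $A$ nor $B$ may land on either side, depending on which enumerator prints them first; both outcomes are consistent with the separation property.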

Turing Diagonalization

Let A be a Turing-recognizable language consisting of descriptions of Turing machines,

{⟨ M_{1} ⟩, ⟨ M_{2} ⟩, \dots .}

, where every

M_{i}

is a decider. Prove that some decidable language

D

is not decided by any decider

M_{i}

whose description appears in

A

If $A$ was finite, then since there are infinitely many decidable languages, at least one of them is not decided by any machine in $A$ . Therefore assume that $A$ is infinite. Since $A$ is turing recognizable, we know that it is enumerable by some enumerator $E$ . Now we define the following turing machine

Let $M$ be the turing machine operating on positive integers such that given the input $⟨ n ⟩$ we run the enumerator until we have printed out $n$ unique turing machines, say $M_{1}, \dots, M_{n}$ , then run $M_{n}$ with the input $⟨ n ⟩$ and do the opposite of whatever $M_{n}$ outputs.

$M$ is a decider. This is because $A$ is infinite, so $E$ prints $n$ unique machines after a finite number of steps, and because every turing machine encoded in $A$ is a decider, so nothing will loop forever in the above definition.

Since we have inverted the behavior for each $⟨ i ⟩$ then we have induced a diagonalization argument, this is because we have inverted the behavior of the selected machine meaning that $⟨ i ⟩ \in lang (M)$ XOR $⟨ i ⟩ \in lang (M_{i})$ must hold true. This shows that $M_{i}$ cannot decide $lang (M)$ for each $i$ , and therefore no turing machine encoded in $A$ can. On the other hand $M$ decides that language, so we can take $D = lang (M)$ , as needed.

Computable Function

A function

f : Σ^{*} \to Σ^{*}

is a computable function if there exists some Turing machine

M

such that for every input

w

the TM

M

halts with just

f (w)

on its tape

Mapping Reducible

A language

A

is said to be mapping reducible to a language

B

written as

A \leq_{m} B

if there is a computable function

f : Σ^{*} \to Σ^{*}

where for every

w

w \in A ⟺ f (w) \in B

the function

f

is called the reduction from

A

to

B

If a Language Is Mapping Reducible to a Regular Language Then It Might Not Be Regular

Suppose that

A \leq_{m} B

and

B

is regular, then it doesn't imply that

A

is a regular language.

Consider the language $A = {0^{n} 1^{n} : n \in ℕ_{0}}$ , which is not a regular language as seen earlier. We will show that $A$ is mapping reducible to $B = {11}$ ; note that this finite language is clearly regular.

So we have to show that there is a computable function $f$ such that $w \in A ⟺ f (w) \in B$ , that is $w \in A ⟺ f (w) = 11$ . If we define the function $f (w) = \begin{cases} 11 & \text{if } w \in A \\ 00 & \text{otherwise} \end{cases}$ then it satisfies the requirement that $w \in A ⟺ f (w) \in B$

This function is computable because we can create a turing machine $M$ that uses a stack to recognize strings from $A$ as they are of the form $0^{n} 1^{n}$ . Whenever it recognizes a string it writes $11$ to the tape, and it writes $00$ if it rejects the input string $w$ . Additionally this machine halts on all inputs, as for every finite string by the time we get to the end of the string in finitely many steps the stack is observed and a decision is made, thus this function is computable.

Thus we've shown that $A \leq_{m} B$ where $B$ is regular, but $A$ is not.
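
The reduction is short enough to write out directly. In the sketch below `in_A` plays the role of the machine that decides $A$ and `f` is the reduction; both names are hypothetical, and the final loop only spot-checks the equivalence $w \in A ⟺ f (w) \in B$ on short binary strings.

```python
from itertools import product


def in_A(w):
    """Decide A = { 0^n 1^n : n >= 0 }."""
    n = len(w) // 2
    return len(w) % 2 == 0 and w == "0" * n + "1" * n


def f(w):
    """The reduction: map members of A to 11 and everything else to 00."""
    return "11" if in_A(w) else "00"


B = {"11"}

for n in range(7):
    for letters in product("01", repeat=n):
        w = "".join(letters)
        assert in_A(w) == (f(w) in B)
```

All of the work is done inside `f` , which is why the regularity of $B$ says nothing about the regularity of $A$ .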

5.9

The Encodings of Reversal Accepting Turing Machines Is Undecidable

Show that the collection

R

of encodings of turing machines that accept a string

rev (w)

whenever they accept

w

is undecidable.

Suppose for the sake of contradiction that $R$ was decidable, we'll get a contradiction by showing that $A_{T M}$ is decidable.

Since $R$ is decidable, then there exists some decider $D$ for the language $R$ , we'll now construct a decider for $A_{T M}$ to obtain our contradiction.

We will do this by constructing a decider $A$ for $A_{T M}$ using the following idea: given $⟨ M, w ⟩$ we only accept if $D$ accepts $⟨ M^{'} ⟩$ where $M^{'}$ is a turing machine that we will define during an intermediate phase. $M^{'}$ will have the property that $M$ accepts $w$ if and only if $⟨ M^{'} ⟩ \in R$ .

If all that is set in place, then we have $⟨ M, w ⟩$ is accepted by $A$ iff $D$ accepts $⟨ M^{'} ⟩$ iff $M$ accepts $w$ , that is $A$ accepts $⟨ M, w ⟩$ iff $M$ accepts $w$ .

Now that we have the idea, we can formally construct $A$ based on the specification of $R$ .

Given $⟨ M, w ⟩$
Construct $M^{'}$ as follows

on input $y$

if $y \in 0^{+} 1^{+}$ accept
if $y \notin 0^{+} 1^{+}$

run $M$ on the input $w$ if it accepts, then accept, otherwise reject

Run $D$ on the input $⟨ M^{'} ⟩$ and output what it outputs

Note that $A$ is a decider. This is because the construction of $M^{'}$ takes finitely many steps as we are not actually running anything, just constructing, and then when we run $D$ on the encoding of this machine, since $D$ is a decider it is guaranteed to halt in finitely many steps.

The language $0^{+} 1^{+}$ seems arbitrary, but it was simply chosen as it is one of the many languages that has the property that there exists some string $x$ in the language such that $rev (x)$ is not in the language. For this example we can see that the string $001$ is in the language but $100$ is not. Keep in mind we could have used any language that satisfies this property.

Thus if we have any turing machine $T$ such that $lang (T) = 0^{+} 1^{+}$ then we know that $⟨ T ⟩ \notin R$ , while at the same time if $lang (T) = {0, 1}^{*}$ then clearly $⟨ T ⟩ \in R$ .

Therefore if we observe our definition of $A$ we can see that if $M$ accepts $w$ then $lang (M^{'}) = {0, 1}^{*}$ so then $⟨ M^{'} ⟩ \in R$ , and also if $M$ does not accept $w$ then $lang (M^{'}) = 0^{+} 1^{+}$ so that implies that $⟨ M^{'} ⟩ \notin R$ , thus $M$ accepts $w$ iff $⟨ M^{'} ⟩ \in R$ .

Since we have a decider $D$ for $R$ then we just have to run $D$ with the input $⟨ M^{'} ⟩$ . Therefore we've constructed a decider for $A_{T M}$ which is a contradiction, therefore it must not be true that $R$ is decidable, therefore it is undecidable.

The Busy Beaver Function Is Not Computable

Let

Γ = {0, 1,_}

be the tape alphabet, and define the busy beaver function

B B : ℕ \to ℕ

as follows. For each value of k, consider all k-state TMs that halt when started with a blank tape. Let

B B (k)

be the maximum number of 1s that remain on the tape among all of these machines. Show that

B B

is not a computable function.

Suppose for the sake of contradiction that $B B$ was a computable function, that is there is some Turing machine $M$ such that on every input $w$ it halts with just $B B (w)$ on its tape. Specifically we'll assume without loss of generality that it does so in unary, so we have some turing machine $M$ such that whenever the input is $1^{n}$ then $M$ halts with $1^{B B (n)}$ on the tape.

We'll now observe a special TM $M^{'}$ that will yield a contradiction, this turing machine will fit those turing machines that the busy beaver function talks about, and so it will be a turing machine that halts when started on a blank tape. It will write $n$ ones to the tape, it then doubles the number of ones and then simulates the turing machine $M$ on what is currently on the tape ( $1^{2 n}$ ), and therefore will halt with $B B (2 n)$ ones on the tape.

Next we can show that $B B$ is a strictly increasing function. This is because for every $k$ -state turing machine that halts when started on a blank tape, there is an analogous $k + 1$ -state turing machine that also halts when started on the blank tape and simulates the other machine with one extra state which does nothing. This shows us that $B B (k) \leq B B (k + 1)$ , and if we use that extra state to write another one to the tape, then we have that $B B (k) < B B (k + 1)$ .

$M^{'}$ uses $n$ states to write the initial $n$ ones to the tape, one state per $1$ . Doubling the ones on the tape only requires a constant number of states, and simulating $M$ takes a fixed finite number of states, so $M^{'}$ has $n + k$ states for some constant $k$ that does not depend on $n$ . $M^{'}$ is a Turing machine that halts when started on a blank tape, and when it halts it has $B B (2 n)$ ones on the tape. Since $B B (n + k)$ is the maximum number of ones over all Turing machines with $n + k$ states that halt when started on a blank tape, we can conclude that $B B (n + k) \geq B B (2 n)$ .

Since $k$ is a constant, once $n > k$ we have $n + k < 2 n$ , and since $B B$ is strictly increasing this implies $B B (n + k) < B B (2 n)$ , which contradicts $B B (n + k) \geq B B (2 n)$ . Therefore our assumption that $B B$ is computable is false.

5.16

A Language Is Turing Recognizable Iff It Is Mapping Reducible to the Acceptance Set for Turing Machines

A

is Turing recognizable iff

A \leq_{m} A_{T M}

Suppose that $A$ is Turing-recognizable, so there exists a Turing machine $M$ that recognizes $A$ . Consider the function $f (w) = ⟨ M, w ⟩$ . It is computable: there is a Turing machine that for every $w$ halts with just $⟨ M, w ⟩$ on its tape, because all it has to do is write the fixed encoding of $M$ followed by $w$ .

Moreover we know $w \in A$ iff $w \in lang (M)$ iff $⟨ M, w ⟩ \in A_{T M}$ iff $f (w) \in A_{T M}$ , that is to say $w \in A ⟺ f (w) \in A_{T M}$ , showing that $A$ is mapping reducible to $A_{T M}$ .

Now we show the other direction. Assume that $A \leq_{m} A_{T M}$ . Since $A_{T M}$ is Turing recognizable by the universal Turing machine (the Turing machine that just simulates other Turing machines), and any language that is mapping reducible to a Turing-recognizable language is itself Turing recognizable, $A$ is Turing recognizable, as needed.

A Language Is Decidable If and Only If It Is Mapping Reducible to Zeros Then Ones

A

is decidable if and only if

A \leq_{m} 0^{*} 1^{*}

$⟹$ Suppose that $A$ is decidable, so there is some Turing machine $D$ that on input $w$ accepts if $w \in A$ and rejects if $w \notin A$ . Now make a simple wrapper Turing machine $R$ that takes an input $x$ , simulates $D$ on $x$ , and outputs $01$ if $D$ accepts and $10$ if $D$ rejects. This gives a computable function $f$ such that $f (x) = 01$ if $x \in A$ and $f (x) = 10$ if $x \notin A$ .

Since $01 \in 0^{*} 1^{*}$ and $10 \notin 0^{*} 1^{*}$ , we can say that $x \in A ⟺ f (x) \in 0^{*} 1^{*}$ , and therefore we have $A \leq_{m} 0^{*} 1^{*}$ .
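
The reduction in this direction can be sketched in Python. This is only an illustration under assumptions: the decider $D$ is modelled as a total Python predicate, and the names `make_reduction`, `in_zeros_then_ones` and the example language `has_even_length` are hypothetical, not part of the proof.

```python
import re

def make_reduction(decides_a):
    """Build f from a total membership test for A (a stand-in for the decider D)."""
    def f(x):
        # 01 lies in 0*1*, while 10 does not.
        return "01" if decides_a(x) else "10"
    return f

def in_zeros_then_ones(s):
    """Decide membership in the regular language 0*1*."""
    return re.fullmatch(r"0*1*", s) is not None

# Hypothetical example language A: strings of even length.
def has_even_length(x):
    return len(x) % 2 == 0

f = make_reduction(has_even_length)
for x in ["", "0", "10", "111"]:
    # x is in A exactly when f(x) is in 0*1*.
    assert has_even_length(x) == in_zeros_then_ones(f(x))
```

Any other pair of fixed strings, one inside $0^{*} 1^{*}$ and one outside, would work equally well as the two outputs of $f$ .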

$⟸$ Now we assume that $A \leq_{m} 0^{*} 1^{*}$ . Since $0^{*} 1^{*}$ is a regular language and we already showed that regular languages are decidable, $0^{*} 1^{*}$ is decidable, and thus $A$ is decidable.

Prefixing the Acceptance Set for Turing Machines Makes It Not Turing Recognizable

Consider the set:

J = {w : w = 0 x f.s. x \in A_{T M}} \cup {w : w = 1 x f.s. x \in \overline{A_{T M}}}

Show that neither

J

nor

\overline{J}

is Turing recognizable.

We claim that $\overline{A_{T M}}$ is mapping reducible to $J$ by the function $f (w) = 1 w$ . First, $f$ is computable, as there is certainly a Turing machine that prepends $1$ to the input string on its tape. Additionally, if $w \in \overline{A_{T M}}$ then $f (w) = 1 w \in J$ , and if $1 w \in J$ it implies that $1 w \in {y : y = 1 x f.s. x \in \overline{A_{T M}}}$ , therefore $w \in \overline{A_{T M}}$ . So we have $\overline{A_{T M}} \leq_{m} J$ , and since $\overline{A_{T M}}$ is not Turing recognizable, neither is $J$ .

We also claim that $A_{T M}$ is mapping reducible to $J$ by the function $f (w) = 0 w$ . The details are similar to the above: $f$ is computable, if $w \in A_{T M}$ then $f (w) = 0 w \in J$ , and if $0 w \in J$ then it must be that $w \in A_{T M}$ . Thus we have $A_{T M} \leq_{m} J$ , which is the same statement as $\overline{A_{T M}} \leq_{m} \overline{J}$ , and since $\overline{A_{T M}}$ is not Turing recognizable, neither is $\overline{J}$ .

Acceptance Set for Context Free Grammars

A_{C F G} = {⟨ G, w ⟩ : G is a CFG that generates the string w}

The Acceptance Set for Context Free Grammars Is Decidable

A_{C F G}

is decidable

TODO: Add the proof here.

The Set of Equivalent Context Free Grammars

E Q_{C F G} = {⟨ G, H ⟩ : G, H are CFG’s and lang (G) = lang (H)}

The Set of Equivalent Context Free Grammars Is Co Turing Recognizable

As per title.

In order to show that this is true we have to show that the complement of

E Q_{C F G}

is Turing recognizable. So we need a Turing machine that can recognize the language

\overline{E Q_{C F G}} = {⟨ G, H ⟩ : G, H are CFG’s and lang (G) \neq lang (H)}

In order to know if the two languages differ we just need to find a string which is in one but not the other. We also note that the set of strings over an alphabet is countable, meaning that they can be enumerated as

s_{1}, s_{2}, \dots

Now we create a recognizer as follows

on input $⟨ G, H ⟩$ :

for each string $s_{1}, s_{2}, \dots$
since $A_{C F G}$ is decidable there is a decider $D$ for it, so we can run $D$ on $⟨ G, s_{i} ⟩$ and on $⟨ H, s_{i} ⟩$ and compare the results; if they are the same go to the next iteration, otherwise accept

We claim that this is a recognizer. This is because $⟨ G, H ⟩ \in \overline{E Q_{C F G}}$ if and only if there exists some $i \in ℕ$ such that $D$ accepts $⟨ G, s_{i} ⟩$ and rejects $⟨ H, s_{i} ⟩$ (or vice versa), iff the above Turing machine terminates in the accept state after $i$ iterations.

Therefore the above Turing machine terminates in the accept state if and only if $⟨ G, H ⟩ \in \overline{E Q_{C F G}}$ , so it is a recognizer.
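
As a rough illustration of the recognizer above, here is a Python sketch. It assumes that the two calls to the decider $D$ are available as total predicates `generates_g` and `generates_h`. These names, `strings_in_order`, `first_difference` and the two toy languages are all hypothetical stand-ins and not real CFG machinery.

```python
import re
from itertools import count, product

def strings_in_order(alphabet):
    """Enumerate s_1, s_2, ... : all strings over the alphabet, shortest first."""
    for n in count(0):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def first_difference(generates_g, generates_h, alphabet):
    """Accept (by returning a witness) once the two deciders disagree.

    If the two languages are equal this loop never ends, which is exactly
    the behaviour allowed for a recognizer.
    """
    for s in strings_in_order(alphabet):
        if generates_g(s) != generates_h(s):
            return s

# Toy stand-ins for "D on <G, s>" and "D on <H, s>".
def in_anbn(s):
    n = len(s) // 2
    return s == "a" * n + "b" * n

def in_astar_bstar(s):
    return re.fullmatch(r"a*b*", s) is not None

witness = first_difference(in_anbn, in_astar_bstar, "ab")
```

Here the witness is a string generated by one toy language but not the other, so the loop stops after finitely many iterations.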

There Is an Undecidable Subset of the Kleene Star of One

There exists a subset

S \subseteq 1^{*}

that is undecidable

Instead of doing a Turing machine argument we do it by counting. First of all we show that $𝒫 (1^{*})$ is in bijection with ${0, 1}^{\infty}$ . To do this, consider any subset of $1^{*}$ , which is of the form $S = {s_{1}, s_{2}, \dots}$ where each $s_{i} \in 1^{*}$ . Each $s_{i}$ is a sequence of $n_{i}$ ones in a row, so it is determined by its length $| s_{i} | = n_{i}$ . For each $s_{i}$ we set the $n_{i}$ th element of the infinite binary string to $1$ (indexing from $0$ , since the empty string has length $0$ ) and we set all the other elements to zero. This sets up the mapping.

This mapping is surjective: given any infinite binary string, we select the elements of $1^{*}$ whose lengths are the positions that hold a $1$ , and this subset maps to the given string. It is also injective: two different subsets differ on some string $1^{n}$ , so their binary strings differ at position $n$ .

Thus we have a bijection from $𝒫 (1^{*})$ to ${0, 1}^{\infty}$ therefore since we already know that the latter is uncountable, then so is the former.
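
To make the mapping concrete, the following Python sketch computes a finite prefix of the binary string for a finite subset of $1^{*}$ and inverts it. The function names are hypothetical, and working with finite prefixes is a simplifying assumption, since the real objects are infinite.

```python
def subset_to_bits(subset, prefix_len):
    """First prefix_len bits of the binary string of a subset of 1*.

    Position n holds 1 exactly when the string of n ones is in the subset.
    """
    lengths = {len(s) for s in subset}
    return [1 if n in lengths else 0 for n in range(prefix_len)]

def bits_to_subset(bits):
    """Inverse direction: collect 1^n for every position n that holds a 1."""
    return {"1" * n for n, bit in enumerate(bits) if bit == 1}

example = {"", "11"}
bits = subset_to_bits(example, 4)
```

Applying `bits_to_subset` to `bits` gives back `example`, which is the round trip that the bijection argument relies on.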

Since every Turing machine can be encoded as a string, and the set of all possible strings is countable, the number of possible Turing machines is countable, and thus the number of languages that are recognized by some Turing machine is also countable. Therefore there is some element of $𝒫 (1^{*})$ which is not recognized by any Turing machine. Select that one: it is our example of a subset of $1^{*}$ which is undecidable, because every decidable language is recognizable and this one is unrecognizable.

Fixed Point of the Accept and Reject Switching Function for Turing Machines

In the fixed point version of the recursion theorem, let the transformation

t

be a function that interchanges the states

q_{a c c e p t}

and

q_{r e j e c t}

in Turing machine descriptions. Find a fixed point for

t

Recall that a fixed point is a Turing machine $M$ for which $t (⟨ M ⟩)$ describes a Turing machine equivalent to the original. Suppose we had such a machine $M$ and let's see the restrictions on this machine. If we give this machine some input $w$ and it halts, then this must be because it entered either the accept or the reject state. Since $t (⟨ M ⟩)$ inverts this behavior, on this input the two machines would differ (one accepts and one rejects). Thus if such a fixed point machine exists, it must be that it never halts.

If $M$ is a Turing machine that loops forever on every input, then by definition it never enters the accept or reject state. Therefore $t (⟨ M ⟩)$ is still a machine that loops forever on all inputs, so the language of the two machines is the same (empty) and they are equivalent. Thus any machine that loops on all possible inputs is a fixed point.

For Any Two Languages There Is Always Another Language That Both Are Turing Reducible To

As per title.

Something that works well here is embedding both languages into the set $J = 0 A ⊔ 1 B$ , that is, $J = {0 w : w \in A} \cup {1 w : w \in B}$ . The reason we use such a set is that we know that $w \in A ⟺ 0 w \in J$ and that $w \in B ⟺ 1 w \in J$ . We start by proving that $A \leq_{T} J$ . To do this we assume that we have an oracle $O$ for $J$ , and we construct a decider for $A$ as follows:

on input $w$ we query the oracle whether $0 w$ is an element of $J$ ; if it is we accept, and otherwise we reject

As mentioned above $0 w \in J ⟺ w \in A$ , thus the above Turing machine accepts if $w \in A$ and rejects if $w \notin A$ , that is, it is a decider for $A$ .

The exact same construction where we query the oracle with $1 w$ decides $B$ , so $B \leq_{T} J$ as well.
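
The two oracle machines are small enough to sketch in Python. This is only an illustration: the oracle for $J$ is modelled as a Python predicate, which is an assumption (a real oracle need not be computable), and `make_oracle_for_j`, `decide_a`, `decide_b` and the two toy languages are hypothetical names.

```python
def make_oracle_for_j(in_a, in_b):
    """Model an oracle for J = 0A ⊔ 1B from membership tests for A and B."""
    def oracle(s):
        if s.startswith("0"):
            return in_a(s[1:])
        if s.startswith("1"):
            return in_b(s[1:])
        return False  # the empty string is not in J
    return oracle

def decide_a(w, oracle):
    """Decider for A relative to J: a single query with 0w."""
    return oracle("0" + w)

def decide_b(w, oracle):
    """Decider for B relative to J: a single query with 1w."""
    return oracle("1" + w)

# Toy languages, used only to exercise the construction.
in_a = lambda w: w.endswith("1")
in_b = lambda w: len(w) == 2
oracle = make_oracle_for_j(in_a, in_b)
```

Each decider makes exactly one query, so it halts on every input as long as the oracle answers.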

Computing the Descriptive Complexity With an Oracle for the Acceptance Set for Turing Machines

As per title.

We need to come up with an algorithm which computes the descriptive complexity of a binary string $x$ . We'll first introduce the idea, and then refine it so that it actually works correctly. A naive approach would be to go through all binary strings in lexicographic order (shorter strings first). If we were on string $s$ we would try all possible ways of parsing it into the form $⟨ M, w ⟩$ where $M$ is a Turing machine and $w$ an arbitrary input string; if no parsing works for string $s$ then we would try the next string.

If it so happened that $s$ was parsable into $⟨ M, w ⟩$ then our goal would be to check if $M$ on input $w$ halts with $x$ on its tape, and if so we would return $| s |$ . This works because the ordering forces shorter strings to come before longer strings, so the first description of $x$ that we find is as short as possible, and in the case of ties it is the one that comes first in the lexicographic ordering.

The one problem with our plan is that running $M$ on input $w$ may never halt, as it's just a regular Turing machine, so we need a way to get around this. This is why the assumption of having the oracle is there, as it gives us a method to determine if a Turing machine will halt. Consider the Turing machine $M$ . If we construct a new Turing machine $N$ that is $M$ , except that whenever there is a transition that leads to the reject state it instead takes us to the accept state, then the oracle accepts $⟨ N, w ⟩$ iff $M$ on input $w$ halts (enters either the reject or the accept state). This provides us with a way to check whether $M$ on input $w$ halts before we simulate it, so our algorithm never gets stuck in a simulation that runs forever. Thus our final algorithm is as follows:

For each binary string $s$ iterated in lexicographic order
attempt to parse $s$ into $⟨ M, w ⟩$ where $M$ is a Turing machine and $w$ an input; if it cannot be parsed, then move to the next string, otherwise continue to the next step
Make the modification of $M$ to $N$ as specified in the previous paragraph, and check if $⟨ N, w ⟩ \in A_{T M}$ using the oracle; if it is not an element then move to the next string, otherwise we know that $M$ on input $w$ halts
Run $M$ on input $w$ ; if what remains on the tape is $x$ then return $| s |$ and stop this algorithm, otherwise move to the next string

Also note that even though the above seems to be a potentially infinite loop, it always terminates: we already know that there is some $c \in ℕ$ such that $K (x) \leq | x | + c$ , so some string of length at most $| x | + c$ is a description of $x$ and we only have to test strings up until this cut off.
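
The control flow of this algorithm can be sketched in Python under heavy assumptions. The parser, the halting test (which in the text comes from the oracle for $A_{T M}$ applied to $⟨ N, w ⟩$ ) and the simulator are passed in as the hypothetical parameters `parse`, `halts` and `run`. The `toy_*` functions describe a made-up two-machine universe that exists only so the sketch can be executed.

```python
from itertools import count, product

def binary_strings():
    """All binary strings, shorter strings first, ties broken lexicographically."""
    for n in count(0):
        for chars in product("01", repeat=n):
            yield "".join(chars)

def descriptive_complexity(x, parse, halts, run):
    """Length of the first description <M, w> whose machine halts with output x."""
    for s in binary_strings():
        parsed = parse(s)
        if parsed is None:
            continue                  # s is not of the form <M, w>
        machine, w = parsed
        if not halts(machine, w):     # oracle query, so we never simulate a looper
            continue
        if run(machine, w) == x:
            return len(s)

# Toy universe (assumption): "1w" encodes a machine that outputs w,
# "01w" encodes a machine that loops forever, anything else does not parse.
def toy_parse(s):
    if s.startswith("1"):
        return ("copy", s[1:])
    if s.startswith("01"):
        return ("loop", s[2:])
    return None

def toy_halts(machine, w):
    return machine == "copy"

def toy_run(machine, w):
    return w  # only ever called on the halting "copy" machine

k = descriptive_complexity("0101", toy_parse, toy_halts, toy_run)
```

In this toy universe every string has the description obtained by prefixing a single symbol, which mirrors the bound $K (x) \leq | x | + c$ with $c = 1$ .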

The Acceptance Set for Oracle Turing Machines Is Undecidable

As per title.

Suppose for the sake of contradiction that $A_{T M}^{'}$ is decidable relative to $A_{T M}$ , which is to say that there is an oracle machine $H^{A_{T M}}$ that decides $A_{T M}^{'}$ . We now construct the following oracle machine, which we will call $N$ :

On input $⟨ M ⟩$
Run $H^{A_{T M}}$ on $⟨ M, ⟨ M ⟩ ⟩$
If $H^{A_{T M}}$ accepts, then reject and if it rejects then accept

Since this machine just inverts the output of a decider it is also a decider. If we run $N$ on the input $⟨ N ⟩$ , then it accepts iff $H^{A_{T M}}$ rejects $⟨ N, ⟨ N ⟩ ⟩$ , iff $N$ does not accept $⟨ N ⟩$ , which is a contradiction. Likewise it rejects iff $N$ accepts $⟨ N ⟩$ , which is also a contradiction. Therefore our assumption is incorrect and so $A_{T M}^{'}$ is undecidable relative to $A_{T M}$ .

There Exists Languages That Are Not Recognizable by a Turing Machine With an Oracle for the Acceptance Set for Turing Machines

As per title.

We already know that there are uncountably many languages, and we also discovered that there are countably many turing machines, and thus there are countably many turing machines that have an oracle for

A_{T M}

, and each such Turing machine recognizes exactly one language, so only countably many languages are recognized this way. Since there are uncountably many languages, there must be a language that no Turing machine with such an oracle recognizes, as needed.

The Language of Turing Machines That Halt on All Inputs Is Neither Recognizable nor Co-recognizable

As per title.

Recall that $H = {⟨ M, w ⟩ : M (w) halts}$ is not decidable, but it is recognizable. Because of that we know that its complement $\overline{H}$ must not be recognizable: if it were, then together with the recognizer for $H$ we would be able to construct a decider for $H$ .

Since $\overline{H}$ is not recognizable, if we can make a mapping reduction from $\overline{H}$ to $H A = {⟨ N ⟩ : N halts on all inputs}$ , then we can deduce that $H A$ is unrecognizable as well. In order to do this we have to come up with a computable function $f$ such that $x \in \overline{H} ⟺ f (x) \in H A$ . In other words, whenever we have a Turing machine and input $⟨ M, w ⟩$ such that $M$ does not halt on $w$ , it gets mapped to a Turing machine $N$ that always halts, and if $f (x)$ is a Turing machine that always halts, then it must be that $x$ is of the form $⟨ M, w ⟩$ where $M$ loops on input $w$ .

We construct the function $f$ as follows, so that on input $⟨ M, w ⟩$ :

We construct a Turing machine $N$ such that on input $x$
Simulate $M$ on input $w$ for $| x |$ steps
If $M$ halted, then go into an infinite loop
If $M$ has not yet halted, then halt

Now suppose that $x \in \overline{H}$ , so $x = ⟨ M, w ⟩$ where $M$ loops on input $w$ , and $f (x) = ⟨ N ⟩$ where $N$ is specified above. For any input $y$ , if we run $N$ on $y$ then since $M$ loops we hit the second if statement and $N$ halts, which implies that $⟨ N ⟩ \in H A$ . So $x \in \overline{H} ⟹ f (x) \in H A$ .

Now suppose that $x \notin \overline{H}$ , so that $x = ⟨ M, w ⟩$ where $M$ halts on input $w$ , say in $k$ steps, and $f (x) = ⟨ N ⟩$ . Select an input $y$ such that $| y | \geq k$ . After $| y |$ steps of the simulation we've guaranteed that $M$ has halted on $w$ , so $N$ goes into the first if statement and loops forever on $y$ , which implies that $⟨ N ⟩ \notin H A$ . This concludes showing that $x \in \overline{H} ⟺ f (x) \in H A$ . Since $f$ is computable, we've shown that $\overline{H}$ is mapping reducible to $H A$ , and since $\overline{H}$ is unrecognizable so is $H A$ .

Next we show that the complement of $H A$ , which is the set $\overline{H A} = {⟨ M ⟩ : M loops on at least one input}$ , is not recognizable. We do a similar reduction from $\overline{H}$ : given any $⟨ M, w ⟩$ we let $f (⟨ M, w ⟩) = ⟨ N ⟩$ where $N$ is a machine that on input $x$ ignores $x$ and just simulates $M$ on input $w$ .

Suppose $x \in \overline{H}$ , so $x = ⟨ M, w ⟩$ and $M$ does not halt on $w$ . Then the machine $N$ doesn't halt on any input, and so $⟨ N ⟩ \in \overline{H A}$ .

Now suppose that $x \notin \overline{H}$ , that is to say $x = ⟨ M, w ⟩$ and $M$ halts on input $w$ . Then $N$ halts on every input, meaning that $⟨ N ⟩ \notin \overline{H A}$ . This shows that $x \in \overline{H} ⟺ f (x) \in \overline{H A}$ , and since $\overline{H}$ is unrecognizable, so is $\overline{H A}$ .
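
The first construction in this proof (simulate $M$ on $w$ for $| x |$ steps, then do the opposite of what was observed) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: machines are modelled as Python generators where each `next` call is one step, `build_n` reports the behaviour of $N$ as a string instead of really looping, and `countdown` and `forever` are hypothetical toy machines.

```python
def halts_within(machine, w, steps):
    """True if the generator-based machine finishes on w within `steps` steps."""
    run = machine(w)
    for _ in range(steps):
        try:
            next(run)          # advance the simulation by one step
        except StopIteration:
            return True        # the machine halted
    return False

def build_n(machine, w):
    """Return N as a function that reports what N does on input x."""
    def n(x):
        if halts_within(machine, w, len(x)):
            return "loops"     # M halted within |x| steps, so N loops forever
        return "halts"         # M has not halted yet, so N halts
    return n

def countdown(w):
    """Toy machine that halts after a few steps."""
    for _ in w:
        yield

def forever(w):
    """Toy machine that never halts."""
    while True:
        yield

n_halting = build_n(countdown, "ab")
n_looping = build_n(forever, "ab")
```

With the toy machines, `n_looping` reports "halts" on every input, while `n_halting` reports "loops" as soon as the input is long enough for the simulation to see `countdown` finish.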

5.22, 5.23