#### Non-Regular Language

#### Definition

A non-regular language is a formal language that cannot be described by a regular expression or recognized by a finite automaton. If recognizing a language requires memory that grows with the length of the input string, the language is likely not regular.

#### Example

A Simple and Canonical Non-regular Language:

L = $\{0^{n} 1^{n} \mid n \ge 0\}$

This language is not regular: recognizing it requires remembering how many 0s have been read so that they can be matched against the 1s, and because n is unbounded, no fixed amount of memory suffices.
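To make the memory requirement concrete, here is a minimal sketch of a recognizer for this language. The function name `in_language` is an illustrative choice. Note that the integer `count` can grow without bound as the input gets longer, and that is exactly the resource a finite automaton does not have.

```python
def in_language(s: str) -> bool:
    """Return True if s is in { 0^n 1^n | n >= 0 }."""
    count = 0          # number of 0s not yet matched by a 1 (unbounded)
    seen_one = False   # becomes True once the first 1 has been read
    for ch in s:
        if ch == "0":
            if seen_one:       # a 0 after a 1 breaks the 0...01...1 shape
                return False
            count += 1
        elif ch == "1":
            seen_one = True
            count -= 1
            if count < 0:      # more 1s than 0s so far
                return False
        else:
            return False       # symbol outside the alphabet {0, 1}
    return count == 0          # every 0 was matched by exactly one 1


print(in_language("000111"))  # a string of the form 0^3 1^3
print(in_language("0010"))    # not of the form 0^n 1^n
```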

#### Methods for Proving Non-regularity

1) Pumping Lemma

2) Fooling sets (distinguishing suffixes)

3) Closure properties

#### Fooling Sets

Before diving into the concept of fooling sets, let’s first understand the concept of distinguishable states and when two strings are considered distinguishable with respect to a language.

#### Distinguishable States

Let M = (Q, $\Sigma$, $\delta$, s, A) be a DFA. Two states $p, q \in Q$ are distinguishable if there exists a string $w \in \Sigma^{*}$ such that

$\delta^* (p,w) \in A$ and $\delta^* (q,w) \not \in A$

or

$\delta^* (p,w) \not \in A$ and $\delta^* (q,w) \in A$

#### Example

Consider the states $q_{0}$ and $q_{1}$ of the DFA above, and let $w = 0$:

$\delta(q_{0},0)=q_{1} \not \in A$

$\delta(q_{1},0)=q_{2} \in A$

Therefore, $q_{0}$ and $q_{1}$ are distinguishable states.
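The definition can also be checked mechanically. The sketch below searches breadth-first over pairs of states for a shortest distinguishing string. The transition table `DELTA` is a hypothetical three-state DFA (it accepts strings containing `00`), chosen only so that it agrees with the two transitions used in the example; it is not necessarily the DFA referred to above. The names `DELTA`, `ACCEPTING`, and `distinguishing_string` are likewise illustrative.

```python
from collections import deque

# Hypothetical DFA, assumed for illustration: accepts strings containing "00".
DELTA = {
    ("q0", "0"): "q1", ("q0", "1"): "q0",
    ("q1", "0"): "q2", ("q1", "1"): "q0",
    ("q2", "0"): "q2", ("q2", "1"): "q2",
}
ACCEPTING = {"q2"}


def distinguishing_string(delta, accepting, alphabet, p, q):
    """Return a shortest w such that exactly one of delta*(p, w) and
    delta*(q, w) is accepting, or None if p and q are indistinguishable."""
    seen = {(p, q)}
    queue = deque([(p, q, "")])
    while queue:
        a, b, w = queue.popleft()
        if (a in accepting) != (b in accepting):
            return w                      # w separates the two states
        for sym in alphabet:
            nxt = (delta[(a, sym)], delta[(b, sym)])
            if nxt not in seen:           # each pair of states is explored once
                seen.add(nxt)
                queue.append((nxt[0], nxt[1], w + sym))
    return None


print(distinguishing_string(DELTA, ACCEPTING, "01", "q0", "q1"))
```

Because there are only finitely many pairs of states, the search always terminates, so for a DFA this check is exact rather than heuristic.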

#### Distinguishable Prefixes

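Two strings x and y are distinguishable for a language L if there exists a suffix w such that exactly one of xw and yw belongs to L.

Distinguishable prefixes must lead to distinct states in any DFA for L: if x and y led to the same state, then xw and yw would also end in the same state for every suffix w, so both would be accepted or both rejected.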
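For small cases, a distinguishing suffix can be found by brute force. The sketch below assumes the language is given as a membership predicate and tries every suffix up to a length bound; `find_distinguishing_suffix`, `is_0n1n`, and the bound `max_len` are illustrative names and choices. A result of `None` only means that no suffix of length at most `max_len` separates the two prefixes, not that none exists.

```python
from itertools import product


def is_0n1n(s: str) -> bool:
    """Membership test for { 0^n 1^n | n >= 0 }."""
    n = len(s) // 2
    return s == "0" * n + "1" * n


def find_distinguishing_suffix(member, x, y, alphabet="01", max_len=6):
    """Return a shortest w (up to max_len) such that exactly one of
    x + w and y + w is in the language, or None if none is found."""
    for length in range(max_len + 1):
        for symbols in product(alphabet, repeat=length):
            w = "".join(symbols)
            if member(x + w) != member(y + w):
                return w
    return None


# The prefixes 0 and 00 are separated by a suffix of 1s.
print(find_distinguishing_suffix(is_0n1n, "0", "00"))
# The prefixes 01 and 0011 are both in the language and no short suffix separates them.
print(find_distinguishing_suffix(is_0n1n, "01", "0011"))
```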

##### Proof by figure

Here, x and y are distinguishable prefixes: they lead to distinguishable states, which cannot be merged. The suffix w distinguishes them, since $xw \in L$ and $yw \not \in L$.

#### Fooling Sets-Definition

For a language L over $\Sigma$, a set of strings F (possibly infinite) is a fooling set (or distinguishing set) for L if every two distinct strings $x, y \in F$ are distinguishable.

If a language L has an infinite fooling set F, then L is not regular. The strings in F are pairwise distinguishable, so any DFA for L must send each of them to a different state and would therefore need infinitely many states. A finite automaton has only finitely many states, so no DFA recognizes L. Since every NFA can be converted to an equivalent DFA, no NFA recognizes L either. Hence L is not regular.

The fooling set method is a proof by contradiction technique which involves the following steps:

- Choose an infinite set of strings F; its members act as prefixes.
- Show that for any two distinct strings $x, y \in F$ there is a suffix z such that exactly one of xz and yz belongs to L, making x and y distinguishable prefixes.
- In any DFA for L, the states reached by these prefixes must be distinct, because the prefixes are distinguishable.
- Because F contains infinitely many pairwise distinguishable prefixes, such a DFA would need infinitely many states.
- Since this is impossible, no DFA recognizes the language, and so the language is not regular.

#### Example

Prove that the language L = $\{0^{n} 1^{n} | n \ge 0 \}$ is not regular using fooling sets.

##### Solution

Let F = $\{0^{i} | i \ge 0\}$ be the proposed fooling set for the language L.

Take any two distinct strings $x = 0^{i}$ and $y = 0^{j}$ in F, where i and j are non-negative integers with $i \ne j$.

Let $z = 1^{i}$.

Then $xz = 0^{i}1^{i} \in L$,

and $yz = 0^{j}1^{i} \not \in L$ because $i \ne j$.

Thus every pair of distinct strings in F is distinguishable, so F is a fooling set for L. Because F is infinite, L cannot be regular.
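The argument can be sanity-checked on a finite part of F. The sketch below checks every pair $0^{i}, 0^{j}$ with $i < j$ below a bound, using the suffix $z = 1^{i}$ from the proof. The names `is_0n1n` and `fooling_pairs_ok` and the bound are illustrative assumptions. A finite check like this does not replace the proof, which covers all i and j; it only confirms that the chosen suffix behaves as claimed on small cases.

```python
from itertools import combinations


def is_0n1n(s: str) -> bool:
    """Membership test for { 0^n 1^n | n >= 0 }."""
    n = len(s) // 2
    return s == "0" * n + "1" * n


def fooling_pairs_ok(bound: int) -> bool:
    """Check that every pair 0^i, 0^j with i < j < bound is
    distinguished by the suffix z = 1^i."""
    prefixes = ["0" * i for i in range(bound)]
    for x, y in combinations(prefixes, 2):   # x = 0^i, y = 0^j with i < j
        z = "1" * len(x)
        # xz must be in L and yz must not be
        if not (is_0n1n(x + z) and not is_0n1n(y + z)):
            return False
    return True


print(fooling_pairs_ok(10))
```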

#### Closure Properties

We can use known regular and non-regular languages to prove that a new language is non-regular. To prove that a language L is non-regular, we combine L with known regular languages using regularity-preserving operations to obtain a known non-regular language.

Consider $L' = L \cap (0^* 1^*)$

If $L'$ is a known non-regular language, then L must be non-regular. Why? Suppose L is regular. Then, since $(0^* 1^*)$ is regular and regular languages are closed under intersection, $L'$ would also be regular. But we know $L'$ is not regular, a contradiction. Hence L is not regular.
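As an illustration, take L to be the set of binary strings with equally many 0s and 1s. This particular L is an assumption chosen for the example and is not fixed by the text above. Intersecting it with $0^* 1^*$ should give exactly $\{0^{n} 1^{n} | n \ge 0\}$, which is known to be non-regular, so L is non-regular as well. The sketch below checks that identity on all binary strings up to a length bound; the function names are illustrative, and the check is a finite sanity test rather than a proof.

```python
import re
from itertools import product


def equal_counts(s: str) -> bool:
    """Assumed example language L: as many 0s as 1s."""
    return s.count("0") == s.count("1")


def in_0star_1star(s: str) -> bool:
    """Membership in the regular language 0*1*."""
    return re.fullmatch(r"0*1*", s) is not None


def is_0n1n(s: str) -> bool:
    """Membership test for { 0^n 1^n | n >= 0 }."""
    n = len(s) // 2
    return s == "0" * n + "1" * n


def intersection_matches(max_len: int) -> bool:
    """Check that L intersected with 0*1* agrees with 0^n 1^n
    on every binary string of length at most max_len."""
    for length in range(max_len + 1):
        for symbols in product("01", repeat=length):
            s = "".join(symbols)
            if (equal_counts(s) and in_0star_1star(s)) != is_0n1n(s):
                return False
    return True


print(intersection_matches(8))
```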

#### Additional Resources

- Textbooks
  - Erickson, Jeff. *Algorithms*
  - Sipser, Michael. *Introduction to the Theory of Computation* - Chapter 1 - Regular Languages - 1.4 Nonregular languages

- Erickson, Jeff.
- Sariel’s Lecture 6