The Pumping Lemma
All regular languages satisfy the pumping lemma:
The pumping lemma states that if a language is regular, then there exists some integer such that every string in at least in length can be split into substrings where:
- The substring to be “pumped” has at least length 1
- The first two substrings together has at most length
- The result of pumping the substring any number of times is also in
If you now understand the lemma given the definition above, the post ends here for you! Otherwise, I hope to explain it in more detail below.
Inside Out
Looking at the innermost statement:
The statement shows that there exists some string that can be split into substrings , , and . The length of must be at least 1 and the length of substring must be at most some constant . Additionally, pumping the string should still give a string in the language . Let’s assume the constant is the length of for now, and look at examples of pumping some string .
The three examples shown above demonstrate that the substrings are chosen in a way that pumping substring still results in a string that is in the regular language .
Note that for the same regular expression and string, dividing the string into the wrong set of substrings can result in a string that is not in the language when pumped, as shown below for example 3:
Now consider the remainder of the statement:
This shows that a regular language will have some constant such that every string that is at least characters long can be pumped according to the rules above.
Therefore, if we can demonstrate that every constant leads to some string (with length ) that cannot be pumped, we can prove that the language is non-regular.
Proving That a Language is Non-Regular
The classic non-regular language to prove using the pumping lemma is over the alphabet .
We first assume that the language is regular, and therefore satistifes the pumping lemma. Our goal: Show that every constant leads to some string (with length ) that cannot be pumped for a language .
We cannot pick a concrete value for , since we would then have to find for an infinite number of values, which is infeasible. Therefore, we need to come up with a string that is at least an unknown characters long. The smallest such string is . However, since we have freedom of choice, we’ll go with for simplicity.
According to the pumping lemma, for some arbitrary . Since and has length , and can only consist of . Additionally, since , must consist of at least 1 .
Here we have our contradiction: Since only contains , pumping even once would result in and no longer having the same number of occurrences. The pumped string would no longer be in .
Therefore, is not a regular language.
Finite vs. Infinite Regular Languages
The pumping lemma is mainly used with infinite regular languages. For finite languages, the pumping lemma is vacuously true.
Consider any finite regular language (for example, ). To pick some integer , we can pick the integer where is the maximum length of a string (in the example, ). Since there can never be a string at least in length in , we don’t need to worry about the remainder of the lemma.
For infinite regular languages, there can never be a maximum string length, so we do have to consider the pumping lemma fully.