We want to do even more with regular expressions, such as matching a word or a number. To do this, we’re going to introduce a visual representation for regular expressions that actually shows exactly what’s going on behind the scenes, and then we’re going to follow along in Python. Suppose we have the regular expression [0 – 9] + % sign. Any character like this that just appears on its own is matched directly, so this catches strings like 30%, 99%, 2% and various other things we might find describing sales or the fat content of milk. Here I’ve drawn a finite state machine, a visual representation of this regular expression. Often there’s an arrow coming out of nowhere on the left that’s not connected to the rest of the picture. That indicates where we start. These 3 circles are states. They represent what we’re up to when we’re matching a string against the regular expression– what configuration we’re in, what our current state of mind is, what we’ve seen so far. I’ve labeled my states 1, 2, and 3. These other arrows are called edges or transitions. They tell us when to move from 1 state to another. I start in state 1, and if I see a 0 – 9, I move over to state 2. This 0 – 9 is the label associated with this edge. Finally, you’ll notice that 1 of my states has a double circle. That’s an accepting state. If we end up in an accepting state at the end of the input, this finite state machine matches the given string. Let’s trace through what happens on input 23%. We start in the start state, and the character we see is a 2, so we follow this edge to state 2. Now the next thing we see is a 3, so we follow this edge back to state 2. These are sometimes called self-loops. It’s a loop that takes me back to right where I started. Now we see the % sign, and we end up in state 3, which is an accepting state, so our finite state machine accepts this string ‘23%’ just like our regular expression would. Let’s try just the string ‘2’. We start in the start state. We see a 2, so we move over here, and then we’re done. We ran out of input, but we’re not in an accepting state. Our finite state machine rejects this just like our regular expression would. Finally, let’s consider the string ‘2x’. We start here in state 1. We see a 2, so we go over to state 2. Then we see an x, and there’s no outgoing edge from state 2 on an x, so we fall off the finite state machine and die. This is very sad, and when this happens our finite state machine does not accept the string, just like the regular expression would not.

Isn't the label for the second edge a plus? "+"

It helped. ðŸ¤—