Python Generators

In this tutorial, you will learn about generators in Python: what generators are, how to create and use them, the significance of the yield keyword, and why generators are useful, all with the aid of simple examples.

What are generators in Python?

In the previous tutorial, you learned how to create iterators in Python, which helps you understand how loops work behind the scenes. Writing an iterator by hand, with the __iter__() and __next__() methods and an explicitly raised StopIteration exception, makes the code lengthy. For large data sets or programs it is also tedious to keep track of all this state manually, and a mechanism that handles it automatically is welcome.

Generators, in Python, are a special kind of function or expression that produces (yields) results one at a time as you iterate over the generator object. A generator is itself an iterator; its main advantage over building the full sequence up front is memory use. Generators can represent a sequence virtually, computing each item on demand, so the whole sequence never has to be stored in memory.
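
As a rough illustration of the contrast with a hand-written iterator, the sketch below implements the same countdown twice: once as a class with __iter__() and __next__(), and once as a generator function. The names Countdown and countdown are hypothetical and chosen only for this example.

```python
class Countdown:
    """Hand-written iterator: counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal that there are no more values.
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator function: same behaviour, far less code."""
    while start > 0:
        yield start
        start -= 1


print(list(Countdown(3)))   # [3, 2, 1]
print(list(countdown(3)))   # [3, 2, 1]
```

Both versions produce the same values; the generator function simply lets Python handle the bookkeeping that the class spells out by hand.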

How do we create generators in Python?

In Python, there are two ways to create generators:

  • Generator Expression
  • Generator function using the yield keyword

Creating a generator using a generator expression

A generator expression is one way of building a generator in Python. It is most convenient for short, simple cases, such as transforming the items of a list. A generator expression follows the syntax of a list comprehension, with parentheses in place of square brackets, so it can be used in many of the places where a list comprehension is used, with the bonus of lower memory consumption. The sequence is produced on demand, so there is no need to keep the whole sequence in memory.

To understand generator expressions more clearly, let us examine the following two programs, which show how a list and a generator differ from each other.
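
The memory claim can be checked with a small experiment. The sketch below is only an illustration: the size of 100,000 items and the variable names are arbitrary choices, and the exact byte counts reported by sys.getsizeof() depend on the Python version and platform.

```python
import sys

numbers = range(100_000)

# The list comprehension builds all 100,000 squares up front.
squares_list = [n * n for n in numbers]

# The generator expression stores only its own state, not the squares.
squares_gen = (n * n for n in numbers)

# getsizeof() reports the container itself, not the objects it refers to.
# Even so, the list is far larger than the generator.
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

# A generator expression can be passed directly to functions such as sum().
total = sum(n * n for n in numbers)
print(total == sum(squares_list))  # True
```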

Program 1

Str_list = ['Yellow','Orange','Red']


# list comprehension
# (the variable name list shadows the built-in list type; avoid this name in real code)
list = [x for x in Str_list]

# print the type
print(type(list))

# Iterate over items in list and print -1st time
for item in list:
    print(item)

# Iterate over items in list and print -2nd time
for item in list:
    print(item) 

Output:

<class 'list'>
Yellow
Orange
Red
Yellow
Orange
Red

Program 2


Str_list = ['Yellow','Orange','Red']


# Creating a generator using generator expression
gen = (x for x in Str_list)

# Print the type
print(type(gen))

# Iterate over items in generator object and print -1st time 
for item in gen:
    print(item)

# Iterate over items in generator object and print -2nd time 
for item in gen:
    print(item) 

Output:

<class 'generator'>
Yellow
Orange
Red

From the above code snippets, you can observe the following:

  1. Both programs start from the same list, Str_list, which contains the color names Yellow, Orange, and Red.
  2. The second statement of Program 1 creates a list object named list, while the second statement of Program 2 creates a generator object named gen.
    • This is the first difference: a list comprehension is written with square brackets, while a generator expression is written with parentheses (round brackets).
  3. Next, we print the type of each object. Program 1 reports <class 'list'> and Program 2 reports <class 'generator'>.
  4. We then use a for loop to iterate over each object and print its items. On this first pass both programs print the same three colors.
    • The second difference lies in how the items are held in memory. A list stores all of its items at once. A generator produces items one at a time on demand; it keeps only the state it needs to produce the next item and does not hold on to items it has already produced.
    • To verify this, call the built-in len() function on each object. len(list) returns 3 because the list holds 3 items, but len(gen) raises a TypeError stating that an object of type 'generator' has no len().
  5. Finally, we iterate over each object a second time.
    • This reveals the third difference: a list can be iterated any number of times, but a generator can be iterated only once. After the first loop the generator is exhausted, so the second loop in Program 2 prints nothing.
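
The len() check and the one-pass behaviour described in the list above can be reproduced with a short sketch. The variable names first_pass and second_pass are hypothetical and used only for this illustration.

```python
colours = ['Yellow', 'Orange', 'Red']
gen = (c for c in colours)

# A generator has no length because its items do not exist yet.
try:
    len(gen)
except TypeError as error:
    print(error)  # object of type 'generator' has no len()

first_pass = list(gen)    # consumes every item
second_pass = list(gen)   # the generator is already exhausted

print(first_pass)   # ['Yellow', 'Orange', 'Red']
print(second_pass)  # []

# To iterate again, create a fresh generator from the original data.
gen = (c for c in colours)
print(next(gen))  # Yellow
```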

Creating a generator using the yield keyword

The yield keyword is what turns an ordinary function into a generator function. A statement that contains yield is called a yield statement. Like a return statement in a normal function, a yield statement hands a value back to the caller; unlike return, it pauses the function rather than ending it.

To make this clearer, let us create a generator function that uses yield to produce a list of colors one item at a time.

colours = ["Yellow", "Orange", "Red"]

def print_colours(colours):
    for c in colours:
        yield c

colour_generator = print_colours(colours)

for c in colour_generator:
    print(c)

Output:

Yellow
Orange
Red

In the above code snippet, print_colours is a generator function because its body contains a yield statement. Calling it does not run the body; it returns a generator, a special kind of iterator, which we store in colour_generator. To get the values out of a generator we use either a loop or the next() function. Here we use a for loop, which requests values one after another until the generator is exhausted, so all three colors are printed.

How a generator function differs from a normal function

Before discussing the differences, examine the two programs below, each of which works with the list of colors:

Normal Function

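clr_list = ["Yellow", "Orange", "Red"]

def print_colours(clrs):
    # Collect every colour first, then return the whole list at once.
    # (A return inside the loop would exit after the first colour.)
    result = []
    for c in clrs:
        result.append(c)
    return result

C_List = print_colours(clr_list)

print(C_List)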

Output:

['Yellow', 'Orange', 'Red']

Generator Function

clr_list =["Yellow", "Orange", "Red"]
def print_colours(clrs):
    
    for c in clrs:
        yield c

C_gen = print_colours(clr_list)

print(C_gen)
print(next(C_gen))
print(next(C_gen))
print(next(C_gen))
print(next(C_gen)) 

Output:

<generator object print_colours at 0x032ECB18>
Yellow
Orange
Red
Traceback (most recent call last):
  File "gen_ex.py", line 62, in <module>
    print(next(C_gen))
StopIteration

Comparing the two code snippets, we can make the following observations.

  • The generator version of print_colours() contains a yield statement, while the normal version contains a return statement.
  • When called, the generator function returns a generator, while the normal function returns a list.
  • Both return and yield hand a value back to the caller, but they differ in what happens to the function afterwards:
    • The return statement terminates the function and hands back the complete list in one go.
    • The yield statement only suspends the function after handing back the yielded value, so it can be resumed later.
  • C_List holds the complete list, while C_gen holds a special iterator called a generator. We can verify this by printing them.
    • print(C_List) displays the entire list ['Yellow', 'Orange', 'Red'], while print(C_gen) displays something like <generator object print_colours at 0x032ECB18> (the address varies from run to run). No list of results has been built in memory at this point; only the generator object exists.
  • A generator function needs something to drive its execution, such as the built-in next() function or a for loop, whereas a normal function runs to completion as soon as it is called.
    • When next() is called on the generator, the body of print_colours runs until it reaches a yield statement.
    • The yield statement hands the yielded value to the caller and suspends the function. The function's state is saved, so the next call resumes right after the yield instead of revisiting elements that have already been produced.
    • This repeats until there are no elements left, at which point next() raises a StopIteration exception.
    • In the above example, next() is called 4 times although the list has only 3 elements. The first three calls return Yellow, Orange, and Red, and the fourth raises StopIteration.
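
The suspend-and-resume behaviour described in the points above can be made visible by recording what the function does between calls. This is only a sketch: the log list and the traced_colours function are hypothetical names introduced for the illustration.

```python
log = []

def traced_colours(colours):
    log.append("started")
    for c in colours:
        yield c
        # Execution continues here on the following next() call.
        log.append("resumed after " + c)
    log.append("finished")

gen = traced_colours(["Yellow", "Orange"])
print(log)        # [] : calling the function does not run its body

print(next(gen))  # Yellow
print(log)        # ['started']

print(next(gen))  # Orange
print(log)        # ['started', 'resumed after Yellow']

# next() accepts a default that is returned instead of raising StopIteration.
print(next(gen, "no more colours"))  # no more colours
print(log)  # ['started', 'resumed after Yellow', 'resumed after Orange', 'finished']
```

Passing a default to next() is one way to avoid the traceback when the number of remaining values is not known in advance.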

Generator function and for loop

We are now well acquainted with generators and how they work behind the scenes. Calling next() by hand quickly becomes tedious because it has to be written once for every value. A for loop is more convenient: it calls next() for us and stops cleanly when the generator is exhausted. The example below yields the items of a list in reverse order.

# generator that yields the items of a list in reverse order
def reverse_list(clr_list):
    length = len(clr_list)
    for i in range(length-1, -1, -1):
        yield clr_list[i]

# using a for loop to print the items in reverse order
for colour in reverse_list(["Yellow", "Orange", "Red"]):
    print(colour)

Output:

Red
Orange
Yellow

In this script, execution starts when the for loop calls the generator function reverse_list. The function does not modify the given list; range(length-1, -1, -1) produces the list's indices from last to first, and the function yields the element at each index. Because the generator is consumed by the for loop, it keeps running until the element at index 0 has been yielded and then stops.
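
To see why the for loop can stand in for repeated next() calls, it may help to write out roughly what the loop does behind the scenes. The sketch below is an approximation for illustration, not the interpreter's actual implementation; the collected list is a hypothetical helper used to record the results.

```python
def reverse_list(clr_list):
    for i in range(len(clr_list) - 1, -1, -1):
        yield clr_list[i]

gen = reverse_list(["Yellow", "Orange", "Red"])
collected = []

# Roughly what "for colour in gen:" does behind the scenes.
iterator = iter(gen)  # for a generator, iter() returns the generator itself
while True:
    try:
        colour = next(iterator)
    except StopIteration:
        break  # the for loop ends silently here
    collected.append(colour)
    print(colour)

# The built-in reversed() gives the same order for sequences such as lists.
print(list(reversed(["Yellow", "Orange", "Red"])))  # ['Red', 'Orange', 'Yellow']
```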

Note: A generator function can consume any iterable, such as a list, tuple, string, or another generator.
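
As a small illustration of this note, the sketch below feeds several kinds of iterables to one generator function. The function name shout is hypothetical and chosen only for this example.

```python
def shout(items):
    """Yield an upper-cased string for each item of any iterable."""
    for item in items:
        yield str(item).upper()

print(list(shout(["Yellow", "Orange"])))  # ['YELLOW', 'ORANGE']
print(list(shout(("red", "green"))))      # ['RED', 'GREEN']
print(list(shout("abc")))                 # ['A', 'B', 'C']
print(list(shout({"sky": "blue"})))       # ['SKY'] (iterating a dict yields its keys)
print(list(shout(shout(["pink"]))))       # ['PINK'] (generators can be chained)
```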

The two main advantages of using generators are:

  • Compared to hand-written iterator classes, generators are easier to implement and need only a few lines of code.
  • Generators produce values on demand and keep only the state needed to resume, so they are memory-efficient when working with large data sets.
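
The memory advantage is clearest with a sequence that could never fit in memory. The sketch below, with the hypothetical function name even_numbers, defines an endless generator and uses itertools.islice() from the standard library to take only the first few values.

```python
from itertools import islice

def even_numbers():
    """Yield 0, 2, 4, ... without ever building the full sequence."""
    n = 0
    while True:
        yield n
        n += 2

# islice() takes only the first five values, so the endless loop is safe.
first_five = list(islice(even_numbers(), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```

Only one value exists at a time, so the same pattern works whether five values are requested or five million.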