Functions as arguments

We are already familiar with the method sort and the function sorted, which are used to sort lists into their natural order. For numbers and strings this usually works just fine. For anything more complicated than that, however, what Python deems to be the natural order of items is not always what was intended by us as programmers.

For example, a list of tuples is, by default, sorted based on the first item of each tuple:

products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]

products.sort()

for product in products:
    print(product)

Sample output

('apple', 3.95) ('banana', 5.95) ('orange', 4.5) ('watermelon', 4.95)

But what if we wanted to sort the list based on the price?

Functions as arguments

A sorting method or function usually accepts an optional second argument which allows you to bypass the default sorting criteria. This second argument is a function which defines how the value of each item on the list is determined. As the list is sorted, Python calls this function when it compares the items to each other.

Let's take a look at an example:

def order_by_price(item: tuple):
    # Return the price, which is the second item within the tuple
    return item[1]

if __name__ == "__main__":
    products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]

    # Use the function order_by_price for sorting
    products.sort(key=order_by_price)

    for product in products:
        print(product)

Sample output

('apple', 3.95) ('orange', 4.5) ('watermelon', 4.95) ('banana', 5.95)

Now the list is sorted based on the prices of the items, but what really happens in the program?

The function order_by_price is actually pretty simple. It takes one item as its argument and returns a value for that item. More specifically, it returns the second item in the tuple, which represents the price. But then we have this line of code, where the sort method is called:

products.sort(key=order_by_price)

Here the sort method is called with a function as its argument. This is not a reference to the return value of the function, but a reference to the function itself. The sort method calls this function multiple times, using each item on the list as the argument in turn.

If we include an extra print statement in the function definition of order_by_price, we can verify that the function does indeed get called once per each item on the list:

def order_by_price(item: tuple):
    # Print the item
    print(f"Function call: order_by_price({item})")

    # Return the price, which is the second item within the tuple
    return item[1]


products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]

# Use the function order_by_price for sorting
products.sort(key=order_by_price)

for product in products:
    print(product)

Sample output

Function call: order_by_price(('banana', 5.95)) Function call: order_by_price(('apple', 3.95)) Function call: order_by_price(('orange', 4.5)) Function call: order_by_price(('watermelon', 4.95)) ('apple', 3.95) ('orange', 4.5) ('watermelon', 4.95) ('banana', 5.95)

The order can be reversed with another keyword argument; reverse, which is available with both the sort method and the sorted function:

products.sort(key=order_by_price, reverse=True)

t2 = sorted(products, key=order_by_price, reverse=True)

A function definition within a function definition

We could also include a named function for this new price-based sort functionality we created. Let's add a function named sort_by_price:

def order_by_price(item: tuple):
    return item[1]

def sort_by_price(items: list):
    # use the order_by_price function here
    return sorted(items, key=order_by_price)

products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]

for product in sort_by_price(products):
    print(product)

If we know that the helper function order_by_price is not used anywhere outside the sort_by_price function, we can place the former function definition within the latter function definition:

def sort_by_price(items: list):
    # helper function defined within the function
    def order_by_price(item: tuple):
        return item[1]

    return sorted(items, key=order_by_price)

Sorting collections of your own objects

Using the same principle, let's write a program which sorts a list of objects from our own Student class in two different ways:

class Student:
    """ The class models a single student """
    def __init__(self, name: str, id: str, credits: int):
        self.name = name
        self.id = id
        self.credits = credits

    def __str__(self):
        return f"{self.name} ({self.id}), {self.credits} cr."


def by_id(item: Student):
    return item.id

def by_credits(item: Student):
    return item.credits


if __name__ == "__main__":
    o1 = Student("Archie", "a123", 220)
    o2 = Student("Marvin", "m321", 210)
    o3 = Student("Anna", "a999", 131)

    students = [o1, o2, o3]

    print("Sort by id:")
    for student in sorted(students, key=by_id):
        print(student)

    print()

    print("Sort by credits:")
    for student in sorted(students, key=by_credits):
        print(student)

Sample output

Sort by id: Archie (a123), 220 cr. Anna (a999), 131 cr. Marvin (m321), 210 cr.

Sort by credits: Anna (a999), 131 cr. Marvin (m321), 210 cr. Archie (a123), 220 cr.

As you can see above, sorting by different criteria works exactly as intended. If the functions by_id and by_credits are not needed elsewhere, there are ways of making the implementation simpler. We will return to this topic after these exercises.

Lambda expressions

We have mostly worked with functions from the viewpoint of modularity. It is true that functions play an important role in managing the complexity of your programs and avoiding code repetition. Functions are usually written so that they can be used many times.

But sometimes you need something resembling a function that you will use just once. Lambda expressions allow you to create small, anonymous functions which are created (and discarded) as they are needed in the code. The general syntax is as follows:

lambda <parameters> : <expression>

Sorting a list of tuples by the second item in each tuple would look like this implemented with a lambda expression:

products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]

# Function is created "on the fly" with a lambda expression:
products.sort(key=lambda item: item[1])

for product in products:
    print(product)

Sample output

('apple', 3.95) ('orange', 4.5) ('watermelon', 4.95) ('banana', 5.95)

The expression

lambda item: item[1]

is equivalent to the function definition

def price(item):
    return item[1]

except for the fact that a lambda function doesn't have a name. This is why lambda functions are called anonymous functions.

In every other respect a lambda function is no different from any other function, and they can be used in all the same contexts as any equivalent named function. For example, the following program sorts a list of strings alphabetically by the last character in each string:

strings = ["Mickey", "Mack", "Marvin", "Minnie", "Merl"]

for word in sorted(strings, key=lambda word: word[-1]):
    print(word)

Sample output

Minnie Mack Merl Marvin Mickey

We can also combine list comprehensions, the join method and lambda expressions. For example, we could sort strings based on just the vowels in them, ignoring all other characters:

strings = ["Mickey", "Mack", "Marvin", "Minnie", "Merl"]

for word in sorted(strings, key=lambda word: "".join([c for c in word if c in "aeiou"])):
    print(word)

Sample output

Mack Marvin Merl Mickey Minnie

Anonymous functions can also be used with other built in Python functions, not just those used for sorting. For example, the min and max functions also take a keyword argument named key. It is used as the criteria for comparing the items when selecting the minimum or maximum value.

In the following example we are dealing with audio recordings. First we select the oldest recording, and then the longest:


class Recording:
    """ The class models a single audio recording """
    def __init__(self, name: str, performer: str, year: int, runtime: int):
        self.name = name
        self.performer = performer
        self.year = year
        self.runtime = runtime


    def __str__(self):
        return f"{self.name} ({self.performer}), {self.year}. {self.runtime} min."

if __name__ == "__main__":
    r1 = Recording("Nevermind", "Nirvana", 1991, 43)
    r2 = Recording("Let It Be", "Beatles", 1969, 35)
    r3 = Recording("Joshua Tree", "U2", 1986, 50)

    recordings = [r1, r2, r3]


    print("The oldest recording:")
    print(min(recordings, key=lambda rec: rec.year))

    print("The longest recording:")
    print(max(recordings, key=lambda rec: rec.runtime))

Sample output

The oldest recording: Let It Be (Beatles), 1969. 35 min. The longest recording: U2 (Joshua Tree), 1986. 50 min.

Functions as arguments within your own functions

We established above that it is possible to pass a reference to a function as an argument to another function. To wrap this section up, let's write our very own function which takes a function as its argument.

# the type hint "callable" refers to a function
def perform_operation(operation: callable):
    # Call the function which was passed as an argument
    return operation(10, 5)

def my_sum(a: int, b: int):
    return a + b

def my_product(a: int, b: int):
    return a * b


if __name__ == "__main__":
    print(perform_operation(my_sum))
    print(perform_operation(my_product))
    print(perform_operation(lambda x,y: x - y))

Sample output

15 50 5

The value returned by the function perform_operation depends on which function was passed as an argument. Any function which accepts two arguments would do, no matter whether it is anonymous or named.

Passing references to functions as arguments to other functions might not be something you will end up doing on a daily basis in your programming career, but it can be a useful technique. This following program selects some lines from one file and writes them to another file. The way the lines are selected is determined by a function which returns True only if the lines should be copied:

def copy_lines(source_file: str, target_file: str, criterion= lambda x: True):
    with open(source_file) as source, open(target_file, "w") as target:
        for line in source:
            # Remove any whitespace from beginning and end of line
            line = line.strip()

            if criterion(line):
                target.write(line + "\n")

# Some examples
if __name__ == "__main__":
    # If the third parameter is not given, copy all lines
    copy_lines("first.txt", "second.txt")

    # Copy all non-empty lines
    copy_lines("first.txt", "second.txt", lambda line: len(line) > 0)

    # Copy all lines which contain the word "Python"
    copy_lines("first.txt", "second.txt", lambda line: "Python" in line)

    # Copy all lines which do not end in a full stop
    copy_lines("first.txt", "second.txt", lambda line: line[-1] != ".")

The function definition contains a default value for the keyword parameter criterion: lambda x: True. This anonymous function always returns True regardless of the input. So, the default behaviour is to copy all lines. As usual, if a value is given for a parameter with a default value, the new value replaces the default value.

You have reached the end of this section! Continue to the next section:

2. Generators

You can check your current points from the blue blob in the bottom-right corner of the page.