Functions as arguments
We are already familiar with the method sort and the function sorted, which are used to sort lists into their natural order. For numbers and strings this usually works just fine. For anything more complicated, however, what Python deems to be the natural order of items is not always what we as programmers intended.
For example, a list of tuples is, by default, sorted based on the first item of each tuple:
products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]
products.sort()
for product in products:
    print(product)

('apple', 3.95)
('banana', 5.95)
('orange', 4.5)
('watermelon', 4.95)
But what if we wanted to sort the list based on the price?
Functions as arguments
The sort method and the sorted function both accept an optional keyword argument named key, which lets you override the default sorting criterion. The argument is a function which determines the value each item is sorted by. As the list is sorted, Python calls this function once for each item and compares the returned values to decide the order.
Let's take a look at an example:
def order_by_price(item: tuple):
    # Return the price, which is the second item within the tuple
    return item[1]

if __name__ == "__main__":
    products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]
    # Use the function order_by_price for sorting
    products.sort(key=order_by_price)
    for product in products:
        print(product)

('apple', 3.95)
('orange', 4.5)
('watermelon', 4.95)
('banana', 5.95)
Now the list is sorted based on the prices of the items, but what really happens in the program?
The function order_by_price is actually pretty simple. It takes one item as its argument and returns a value for that item. More specifically, it returns the second item in the tuple, which represents the price. But then we have this line of code, where the sort method is called:
products.sort(key=order_by_price)
Here the sort method is called with a function as its argument. This is not a reference to the return value of the function, but a reference to the function itself. The sort method calls this function multiple times, using each item on the list as the argument in turn.
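The difference between passing a function and passing its return value can be made concrete with a small sketch. The names key_function and error_message are made up for this illustration and are not part of the example above.

```python
def order_by_price(item: tuple):
    return item[1]

products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50)]

# No parentheses: this is a reference to the function object itself
key_function = order_by_price
print(callable(key_function))

# With parentheses the function is called, and the result is just a number
print(order_by_price(("pear", 1.5)))

# Passing that number as the key fails, because sort needs something it can call
error_message = ""
try:
    products.sort(key=order_by_price(products[0]))
except TypeError as error:
    error_message = str(error)
print(error_message)
```

In other words, the parentheses decide whether the function is handed over or executed on the spot.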
If we include an extra print statement in the function definition of order_by_price, we can verify that the function does indeed get called once for each item on the list:
def order_by_price(item: tuple):
    # Print the item
    print(f"Function call: order_by_price({item})")
    # Return the price, which is the second item within the tuple
    return item[1]

products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]
# Use the function order_by_price for sorting
products.sort(key=order_by_price)
for product in products:
    print(product)

Function call: order_by_price(('banana', 5.95))
Function call: order_by_price(('apple', 3.95))
Function call: order_by_price(('orange', 4.5))
Function call: order_by_price(('watermelon', 4.95))
('apple', 3.95)
('orange', 4.5)
('watermelon', 4.95)
('banana', 5.95)
The order can be reversed with another keyword argument, reverse, which is available with both the sort method and the sorted function:
products.sort(key=order_by_price, reverse=True)
t2 = sorted(products, key=order_by_price, reverse=True)
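As a rough sketch of how the two calls differ, the following example compares them on the same product list. The variable names original_order, most_expensive_first and result are chosen here for illustration only.

```python
def order_by_price(item: tuple):
    return item[1]

products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]
original_order = list(products)

# sorted returns a new list, most expensive first, and leaves the original untouched
most_expensive_first = sorted(products, key=order_by_price, reverse=True)
print(most_expensive_first)
print(products == original_order)

# sort rearranges the list in place and returns None
result = products.sort(key=order_by_price, reverse=True)
print(result)
print(products == most_expensive_first)
```

Both calls produce the same descending order; the difference is only whether the original list is modified.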
A function definition within a function definition
We could also wrap this new price-based sorting in a named function of its own. Let's add a function named sort_by_price:
def order_by_price(item: tuple):
    return item[1]

def sort_by_price(items: list):
    # use the order_by_price function here
    return sorted(items, key=order_by_price)

products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]
for product in sort_by_price(products):
    print(product)
If we know that the helper function order_by_price is not used anywhere outside the sort_by_price function, we can place the former function definition within the latter function definition:
def sort_by_price(items: list):
    # helper function defined within the function
    def order_by_price(item: tuple):
        return item[1]

    return sorted(items, key=order_by_price)
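One consequence of nesting the definition is that the helper is only visible inside the enclosing function. The sketch below tries this out; the names cheapest_first and helper_visible are illustrative assumptions.

```python
def sort_by_price(items: list):
    # helper function defined within the function
    def order_by_price(item: tuple):
        return item[1]

    return sorted(items, key=order_by_price)

products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]
cheapest_first = sort_by_price(products)
print(cheapest_first[0])

# The helper only exists while sort_by_price is running,
# so referring to it out here raises a NameError
try:
    order_by_price(products[0])
    helper_visible = True
except NameError:
    helper_visible = False
print(helper_visible)
```

This keeps the helper out of the way of the rest of the program, at the cost of not being able to reuse or test it separately.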
Sorting collections of your own objects
Using the same principle, let's write a program which sorts a list of objects from our own Student class in two different ways:
class Student:
    """ The class models a single student """
    def __init__(self, name: str, id: str, credits: int):
        self.name = name
        self.id = id
        self.credits = credits

    def __str__(self):
        return f"{self.name} ({self.id}), {self.credits} cr."

def by_id(item: Student):
    return item.id

def by_credits(item: Student):
    return item.credits

if __name__ == "__main__":
    o1 = Student("Archie", "a123", 220)
    o2 = Student("Marvin", "m321", 210)
    o3 = Student("Anna", "a999", 131)
    students = [o1, o2, o3]
    print("Sort by id:")
    for student in sorted(students, key=by_id):
        print(student)
    print()
    print("Sort by credits:")
    for student in sorted(students, key=by_credits):
        print(student)

Sort by id:
Archie (a123), 220 cr.
Anna (a999), 131 cr.
Marvin (m321), 210 cr.

Sort by credits:
Anna (a999), 131 cr.
Marvin (m321), 210 cr.
Archie (a123), 220 cr.
As you can see above, sorting by different criteria works exactly as intended. If the functions by_id and by_credits are not needed elsewhere, there are ways of making the implementation simpler. We will return to this topic after these exercises.
Lambda expressions
We have mostly worked with functions from the viewpoint of modularity. It is true that functions play an important role in managing the complexity of your programs and avoiding code repetition. Functions are usually written so that they can be used many times.
But sometimes you need something resembling a function that you will use just once. Lambda expressions allow you to create small, anonymous functions which are created (and discarded) as they are needed in the code. The general syntax is as follows:
lambda <parameters> : <expression>
Sorting a list of tuples by the second item in each tuple would look like this implemented with a lambda expression:
products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]
# Function is created "on the fly" with a lambda expression:
products.sort(key=lambda item: item[1])
for product in products:
    print(product)

('apple', 3.95)
('orange', 4.5)
('watermelon', 4.95)
('banana', 5.95)
The expression
lambda item: item[1]
is equivalent to the function definition
def price(item):
    return item[1]
except for the fact that a lambda function doesn't have a name. This is why lambda functions are called anonymous functions.
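The equivalence can be checked with a short sketch. The lambda is bound to the name anonymous_price here purely so that it can be inspected; that name is an assumption made for this example, and in normal code a lambda is usually passed directly as an argument.

```python
def price(item):
    return item[1]

# Bound to a name only so that we can compare it with the named function
anonymous_price = lambda item: item[1]

item = ("apple", 3.95)
print(price(item) == anonymous_price(item))

# The named function knows its own name, the lambda only has a placeholder
print(price.__name__)
print(anonymous_price.__name__)
```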
Apart from being limited to a single expression, a lambda function behaves like any other function, and it can be used in all the same contexts as an equivalent named function. For example, the following program sorts a list of strings alphabetically by the last character in each string:
strings = ["Mickey", "Mack", "Marvin", "Minnie", "Merl"]
for word in sorted(strings, key=lambda word: word[-1]):
    print(word)

Minnie
Mack
Merl
Marvin
Mickey
We can also combine list comprehensions, the join method and lambda expressions. For example, we could sort strings based on just the vowels in them, ignoring all other characters:
strings = ["Mickey", "Mack", "Marvin", "Minnie", "Merl"]
for word in sorted(strings, key=lambda word: "".join([c for c in word if c in "aeiou"])):
    print(word)

Mack
Marvin
Merl
Mickey
Minnie
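To see why the words end up in this order, it may help to look at the key values themselves. In the sketch below the lambda is rewritten as a named function, vowels_only, which is a hypothetical name used only for this illustration.

```python
strings = ["Mickey", "Mack", "Marvin", "Minnie", "Merl"]

def vowels_only(word: str):
    # Named equivalent of the lambda used as the key above
    return "".join([c for c in word if c in "aeiou"])

# Print the value each word is actually sorted by
for word in strings:
    print(f"{word}: {vowels_only(word)}")

ordered = sorted(strings, key=vowels_only)
print(ordered)
```

The keys are "ie", "a", "ai", "iie" and "e", and it is these strings, not the original words, that are compared alphabetically.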
Anonymous functions can also be used with other built-in Python functions, not just those used for sorting. For example, the min and max functions also take a keyword argument named key. It is used as the criterion for comparing the items when selecting the minimum or maximum value.
In the following example we are dealing with audio recordings. First we select the oldest recording, and then the longest:
class Recording:
    """ The class models a single audio recording """
    def __init__(self, name: str, performer: str, year: int, runtime: int):
        self.name = name
        self.performer = performer
        self.year = year
        self.runtime = runtime

    def __str__(self):
        return f"{self.name} ({self.performer}), {self.year}. {self.runtime} min."

if __name__ == "__main__":
    r1 = Recording("Nevermind", "Nirvana", 1991, 43)
    r2 = Recording("Let It Be", "Beatles", 1969, 35)
    r3 = Recording("Joshua Tree", "U2", 1986, 50)
    recordings = [r1, r2, r3]
    print("The oldest recording:")
    print(min(recordings, key=lambda rec: rec.year))
    print("The longest recording:")
    print(max(recordings, key=lambda rec: rec.runtime))

The oldest recording:
Let It Be (Beatles), 1969. 35 min.
The longest recording:
Joshua Tree (U2), 1986. 50 min.
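The same idea carries over to the simpler data used earlier in this section. The following sketch applies min and max to the product tuples and the list of names; the variable names cheapest, most_expensive and longest_word are assumptions made for this example.

```python
products = [("banana", 5.95), ("apple", 3.95), ("orange", 4.50), ("watermelon", 4.95)]
strings = ["Mickey", "Mack", "Marvin", "Minnie", "Merl"]

# Compare the tuples by price instead of by name
cheapest = min(products, key=lambda item: item[1])
most_expensive = max(products, key=lambda item: item[1])
print(cheapest)
print(most_expensive)

# A named function works as the key too, here the built-in len.
# Several words share the maximum length; max returns the first one it meets.
longest_word = max(strings, key=len)
print(longest_word)
```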
Functions as arguments within your own functions
We established above that it is possible to pass a reference to a function as an argument to another function. To wrap this section up, let's write our very own function which takes a function as its argument.
from collections.abc import Callable

# the type hint Callable refers to a function
def perform_operation(operation: Callable):
    # Call the function which was passed as an argument
    return operation(10, 5)

def my_sum(a: int, b: int):
    return a + b

def my_product(a: int, b: int):
    return a * b

if __name__ == "__main__":
    print(perform_operation(my_sum))
    print(perform_operation(my_product))
    print(perform_operation(lambda x, y: x - y))

15
50
5
The value returned by the function perform_operation depends on which function was passed as an argument. Any function which accepts two arguments would do, no matter whether it is anonymous or named.
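As a further illustration, functions that ship with Python can be passed in just as well as our own. This sketch repeats perform_operation, using Callable from collections.abc as the type hint, and passes it the built-in functions pow and max as well as sub from the standard operator module.

```python
from collections.abc import Callable
import operator

def perform_operation(operation: Callable):
    # Call the function which was passed as an argument
    return operation(10, 5)

print(perform_operation(pow))                   # 10 to the power of 5
print(perform_operation(max))                   # the larger of 10 and 5
print(perform_operation(operator.sub))          # 10 - 5
print(perform_operation(lambda x, y: x // y))   # integer division of 10 by 5
```

None of these functions were written with perform_operation in mind; they only need to accept two arguments.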
Passing references to functions as arguments to other functions might not be something you will end up doing on a daily basis in your programming career, but it can be a useful technique. The following program selects some lines from one file and writes them to another file. The way the lines are selected is determined by a function which returns True only if the line should be copied:
def copy_lines(source_file: str, target_file: str, criterion=lambda x: True):
    with open(source_file) as source, open(target_file, "w") as target:
        for line in source:
            # Remove any whitespace from beginning and end of line
            line = line.strip()
            if criterion(line):
                target.write(line + "\n")

# Some examples
if __name__ == "__main__":
    # If the third parameter is not given, copy all lines
    copy_lines("first.txt", "second.txt")
    # Copy all non-empty lines
    copy_lines("first.txt", "second.txt", lambda line: len(line) > 0)
    # Copy all lines which contain the word "Python"
    copy_lines("first.txt", "second.txt", lambda line: "Python" in line)
    # Copy all lines which do not end in a full stop
    # (endswith also works for empty lines, where line[-1] would raise an IndexError)
    copy_lines("first.txt", "second.txt", lambda line: not line.endswith("."))
The function definition contains a default value for the keyword parameter criterion: lambda x: True. This anonymous function always returns True regardless of the input. So, the default behaviour is to copy all lines. As usual, if a value is given for a parameter with a default value, the new value replaces the default value.
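The criterion does not have to be a lambda expression. The sketch below passes a named function instead and runs the whole thing against temporary files so that it can be tried out safely. The function is_comment, the file contents and the variable names are all made up for this illustration.

```python
import os
import tempfile

def copy_lines(source_file: str, target_file: str, criterion=lambda x: True):
    with open(source_file) as source, open(target_file, "w") as target:
        for line in source:
            line = line.strip()
            if criterion(line):
                target.write(line + "\n")

def is_comment(line: str):
    # Hypothetical criterion: keep only lines which start with a hash character
    return line.startswith("#")

with tempfile.TemporaryDirectory() as folder:
    source_path = os.path.join(folder, "first.txt")
    target_path = os.path.join(folder, "second.txt")
    with open(source_path, "w") as file:
        file.write("# a comment\nprint('hello')\n\n# another comment\n")

    # Default criterion: every line is copied, including the empty one
    copy_lines(source_path, target_path)
    with open(target_path) as file:
        all_lines = file.read().splitlines()

    # Named function as the criterion: only the comment lines are copied
    copy_lines(source_path, target_path, is_comment)
    with open(target_path) as file:
        comment_lines = file.read().splitlines()

print(all_lines)
print(comment_lines)
```

A named criterion like this is worth considering when the condition is too long to read comfortably on one line or is needed in more than one place.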