Writing files
So far we have read data from files, but it is naturally also possible to write data to files. Typically a program processes data and stores the results in a file, so they can be used later or processed further by some other program.
We can create a new file every time we want to write data to a file, but we can also append new data to the end of an existing file. In both cases we use the open
function from the previous section. For writing files the function requires a second argument.
Creating a new file
If you want to create a new file, you would call the open
function with the additional argument w
, to signify that the file should be opened in write mode. So, the function call could look like this:
with open("new_file.txt", "w") as my_file:
# code to write something to the file
NB: if the file already exists, all the contents will be overwritten. It pays to be very careful when creating new files.
With the file open you can write data to it. You can use the method write
, which takes the string that is to be written as its argument.
with open("new_file.txt", "w") as my_file:
my_file.write("Hello there!")
When you execute the program, a new file named new_file.txt
appears in the directory. The contents would look like this:
Hello there!
If you want line breaks in the file, you will have to add them by hand - the write
function doesn't work exactly like the more familiar print
function, though they are similar. So, the following program
with open("new_file.txt", "w") as my_file:
my_file.write("Hello there!")
my_file.write("This is the second line")
my_file.write("This is the last line")
would result in a file with these contents:
Hello there!This is the second lineThis is the last line
Line breaks are achieved by adding new line characters \n
to the argument strings:
with open("new_file.txt", "w") as my_file:
my_file.write("Hello there!\n")
my_file.write("This is the second line\n")
my_file.write("This is the last line\n")
Now the contents of new_file.txt
would look like this:
Hello there! This is the second line This is the last line
Appending data to an existing file
If you want to append data to the end of a file, instead of overwriting the entire file, you should open the file in append mode with the argument a
.
If the file doesn't yet exist, append mode works exatly like write mode.
The following program opens the file new_file.txt
and appends a couple of lines of text to the end:
with open("new_file.txt", "a") as my_file:
my_file.write("This is the 4th line\n")
my_file.write("And yet another line.\n")
After this program is executed the contents of the file would look like this:
Hello there! This is the second line This is the last line This is the 4th line And yet another line.
In programming practice, appending data to files is not a very common task.
More often a file is read, processed and overwritten in its entirety. For example, when the contents should change in the middle of the file, it is usually easiest to overwrite the entire file.
Writing CSV files
CSV files can be written line by line with the write
method just like any other file. The following example creates the file coders.csv
, with each line containing the name, working environment, favourite language and years of experience of a single programmer. The fields are separated by a semicolon.
with open("coders.csv", "w") as my_file:
my_file.write("Eric;Windows;Pascal;10\n")
my_file.write("Matt;Linux;PHP;2\n")
my_file.write("Alan;Linux;Java;17\n")
my_file.write("Emily;Mac;Cobol;9\n")
Executing this program would result in the following file:
Eric;Windows;Pascal;10 Matt;Linux;PHP;2 Alan;Linux;Java;17 Emily;Mac;Cobol;9
What if the data to be written is stored in computer memory in a list?
coders = []
coders.append(["Eric", "Windows", "Pascal", 10])
coders.append(["Matt", "Linux", "PHP", 2])
coders.append(["Alan", "Linux", "Java", 17])
coders.append(["Emily", "Mac", "Cobol", 9])
We can build the string we want to write as an f-string, and write the ready line to the file like so:
with open("coders.csv", "w") as my_file:
for coder in coders:
line = f"{coder[0]};{coder[1]};{coder[2]};{coder[3]}"
my_file.write(line+"\n")
If each list of coder data was very long, with many more items, building the string by hand would be quite cumbersome. We can use a for
loop to build the string instead:
with open("coders.csv", "w") as my_file:
for coder in coders:
line = ""
for value in coder:
line += f"{value};"
line = line[:-1]
my_file.write(line+"\n")
Clearing file contents and deleting files
Sometimes it is necessary to clear the contents of an existing file. Opening the file in write mode and closing the file immediately will achieve just this:
with open("file_to_be_cleared.txt", "w") as my_file:
pass
Now the with
block only contains the command pass
, which doesn't actually do anything. Python does not allow empty blocks, so the command is necessary here.
It is possible to also bypass the with
block by using the following oneliner:
open('file_to_be_cleared.txt', 'w').close()
Handling data in a CSV format
Let's write a program which assesses students' performance on a course. The program reads a CSV file, which contains weekly exercise points received by the students. The program then calculates the points total and determines the grade attained by each student. Finally, the program creates a CSV file containing the points total and grade for each student.
The CSV file given as input to the program looks like this:
Peter;4;2;3;5;4;0;0 Paula;7;2;8;3;5;4;5 Susan;3;4;3;5;3;4;4 Emily;6;6;5;5;0;4;8
The program logic is divided into three functions: reading the file and processing the contents into an accessible format, determining the grade, and writing the file.
The file is read following the principles covered in the previous section. The data is stored in a dictionary, where the key is the student's name, and the value is a list of the points received by the student, in integer format:
def read_weekly_points(filename):
weekly_points = {}
with open(filename) as my_file:
for line in my_file:
parts = line.split(";")
point_list = []
for points in parts[1:]:
point_list.append(int(points))
weekly_points[parts[0]] = point_list
return weekly_points
The second function is for determining the grade based on the points received. This function is in turn used by the third function, which writes the results to the file.
def grade(points):
if points < 20:
return 0
elif points < 25:
return 1
elif points < 30:
return 2
elif points < 35:
return 3
elif points < 40:
return 4
else:
return 5
def save_results(filename, weekly_points):
with open(filename, "w") as my_file:
for name, point_list in weekly_points.items():
point_sum = sum(point_list)
my_file.write(f"{name};{point_sum};{grade(point_sum)}\n")
This structure lets us write a very simple main function. Notice how the filenames for the files whch are read and written are given as arguments in the main function:
weekly_points = read_weekly_points("weekly_points.csv")
save_results("results.csv", weekly_points)
When the main function is executed, the contents of the file results.csv
created as a result looks like this:
Peter;18;0 Paula;34;3 Susan;26;2 Emily;41;5
Notice how each function defined above is relatively simple, and they all have a single responsibility. This is a common and advisable approach when programming larger wholes. The single reponsibility principle makes verifying functionality easier. It also makes it easier to make changes to the program later, and to add new features.
Say we wanted to add a function for printing out the grade for a single student. We already have a function which determines the student's grade, so we can use this in our new function:
def get_grade(student_name, weekly_points):
for name, point_list in weekly_points.items():
if name == student_name:
return grade(sum(point_list))
weekly_points = read_weekly_points("weekly_points.csv")
print(get_grade("Paula", weekly_points))
3
If we determine a certain functionality in the program needs fixing, in a well designed program the change will affect only some select sections of code, and it will be easier to determine where the changes should be made. For example, if we wanted to change the grade boundaries, we'd only need to implement the change in the function for determining the grade, and it would work also in all the other functions utilizing this function. If the code for this single functionality was implemented in multiple places, there would be a definite risk that we would not remember to change all the instances when changing the functionality.
You can check your current points from the blue blob in the bottom-right corner of the page.