Using the CSV module - Python Bootcamp
A comma-separated values (CSV) file is a convenient way to store data that can be easily read and written. Despite the name, many CSV files aren’t even comma-delimited (tab is common, but people rarely call them TSV), and there is no single standard for the format.
You might think you can trivially read and write CSVs using
.split(',') to read them and ','.join(fields) to write them.
However, this is risky; if some of the fields contain commas
themselves, those fields will be enclosed by quotes and you’ll end up
splitting them in the wrong place.
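To make the risk concrete, here is a minimal sketch comparing a naive split with the csv module on a single line. The sample line and the variable names are made up for illustration; only csv.reader and io.StringIO come from the standard library.

```python
import csv
import io

# A hypothetical one-line CSV whose second field contains a comma,
# so it has been enclosed in quotes.
line = 'spam,"eggs, bacon",beans\n'

# Naive splitting cuts the quoted field into two pieces and keeps the quotes.
naive_fields = line.rstrip('\n').split(',')

# csv.reader understands the quoting and keeps the field intact.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(naive_fields)   # four fields, two of them fragments
print(parsed_fields)  # three fields, as intended
```

The naive version yields four fields instead of three, which is exactly the kind of silent corruption the csv module is meant to prevent.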
Reading a CSV
The simplest way to read a CSV is the csv.reader
function. This will return a reader object that allows you to read
each row as a list. An example adapted from the documentation:
>>> import csv
>>> with open('eggs.csv', newline='') as csvfile:
...     spamreader = csv.reader(csvfile)
...     for row in spamreader:
...         print(', '.join(row))
...
Spam, Spam, Spam, Spam, Spam, Baked Beans
Spam, Lovely Spam, Wonderful Spam
This demonstrates another useful Python construct: using with to
mark off a block where you will use a file. The file is closed
automatically when the block ends, even if an error occurs inside it.
Writing a CSV
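csv.writer allows you to write a CSV by writing one row at a time. Note that
when you write a CSV file, you want to set the mode to 'w' (and, in
Python 3, pass newline='' to open so that line endings are not translated twice).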
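A minimal sketch of writing rows one at a time, assuming Python 3. The file name menu.csv, the temporary directory, and the sample rows are hypothetical and chosen only for illustration.

```python
import csv
import os
import tempfile

# Hypothetical sample data; the last row has a field containing a comma.
rows = [
    ['item', 'price'],
    ['spam', '2.50'],
    ['eggs, scrambled', '3.00'],
]

# Write into a throwaway directory so the example leaves no clutter behind.
path = os.path.join(tempfile.mkdtemp(), 'menu.csv')

# newline='' lets the csv module control line endings itself.
with open(path, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for row in rows:
        writer.writerow(row)

# Read the file back to confirm the rows survive a round trip.
with open(path, newline='') as csvfile:
    round_trip = list(csv.reader(csvfile))

print(round_trip)
```

The writer quotes the field that contains a comma on its own, so the rows read back are identical to the rows written.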
What if you don’t want to rely on hard-coding the order of the fields
in each line? You can use the csv.DictReader and csv.DictWriter
classes to help. These allow you to read and write each row as a
dictionary, with keys being the field names and values being the value
of that field in each row.
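As a rough sketch of the dictionary-based approach, the example below reads rows with csv.DictReader and writes them back with csv.DictWriter in a different column order. The in-memory data and the field names item and price are assumptions made up for this illustration.

```python
import csv
import io

# Hypothetical in-memory CSV; the first line supplies the field names.
source = io.StringIO('item,price\nspam,2.50\neggs,3.00\n')

# Each row comes back as a dictionary keyed by the header fields.
records = list(csv.DictReader(source))

# Write the same records with the columns in a different order.
# DictWriter looks values up by key, so the order in each dict is irrelevant.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=['price', 'item'])
writer.writeheader()
for record in records:
    writer.writerow(record)

result = out.getvalue()
print(records[0]['item'])
print(result)
```

Because fields are addressed by name, reordering or adding columns in the input file does not require changing the code that uses each row.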
The standard Python CSV parser is designed to handle a lot of strange
input well, including CSV files exported from Excel. If you care more about speed than
broad features, take a look at
Here are some exercises to get you used to working with the CSV module. Try these out using a sample CSV file.
- Write a function that reads a CSV and uses a
defaultdict to store a list of the values for each item as it reads in the CSV. Use
- Write a second function that takes the dictionary produced above
and then computes the minimum, maximum, and mean values for each
item. (You’ll need to write your own function to compute the mean.)
Write these values to another CSV with four fields:
item, min, mean, max.
- When you’ve got everything working, replace the reader and writer