Intermediate exercises - Python Bootcamp
Skills used: CSV reading, basic math, dealing with data
Your task is to write a program that summarizes the data contained in
presidents.csv. This contains data from
presidents dataset in R, quarterly approval ratings of US
presidents. Your program
avg_ratings.py should take two filenames as
command-line arguments: the CSV file to read (
the output filename.
The input file contains three columns:
id is a unique identifier for
year gives the year and quarter of the rating, and
rating gives the approval rating.
You should write your output to a new CSV with three columns:
will contain whole years (i.e.,
1975 but no information about
mean will have the mean approval rating for that
median will have the median approval rating for that year.
As some quarters are missing ratings (
NA) you will have to exclude
those from the mean/median calculations. You should write functions to
calculate the mean and median of a list of numbers and remove any
from the list before passing it to those functions.
Your output should look like the following for the first five years:
year,mean,median 1945,81.33333333333333,82 1946,47.0,46.5 1947,51.0,54.5 1948,37.5,37.5 1949,58.5,57.0
An example solution is available in avg_ratings.py.
Skills used: Listing directory contents, classes, string parsing
Your task is to write a program that aggregates data spread across a
number of files. Start by downloading the file
employee_info.zip and unzipping it to
create a folder named
In this folder, there’s a set of files of the format
with information about each employee. For example,
Name: John Smith Title: Researcher
You should write a program that reads in every file in the
employee_info folder and stores the data in a dictionary. Each key
for the dictionary should be the employee ID (which is numeric, but
you can treat it as a string without any problems), and each value
should be a object of the class
You should define the class
Employee yourself such that:
- It has an
__init__function that takes the employee name, id, and title as arguments and stores them in the class.
- It has a
__str__function that allows for nice, human-readable printing of the object. For example, in might return:
John Smith, Researcher (372).
Your program should:
- Take the target directory (
employee_info) as a command-line argument.
- List the contents of the directory.
- Open each file in the directory and store the information in that
file in a new
Employeeobject in the dictionary.
- At the end, print out all of the key-value pairs of the dictionary,
one per line, separated by a colon. For example, a line would be:
372: John Smith, Researcher (372).
- You will need to put together the name of the directory of files
and the individual filenames in order to open them. Use
- You can assume the first line of each file contains the name field and that the second line contains the title field. (If you want an extra challenge, write a solution that does not make this assumption.)
- You can use
splitto parse each string in the file. If you want an additional challenge, use a regular expression, using the module
- You should not assume that a name only consists of a first and last
name; your solution should still work if the name is “John Q Public.”
If you notice that you are only extracting the first name, you are
An example solution is available in employee_info.py.
On your own
Now’s a great time to come up with your own exercise and try it out!