Anti-Patterns in Python Programming
Constantine Lignos | Last updated: July 09, 2014
Note (9/1/2019): The advice here is somewhat out of date at this point. An update is coming sometime this year to change the examples to reflect Python 3 and modern performance characteristics of iteration.
This page is a collection of the most unfortunate but occasionally subtle issues I’ve seen in code written by students new to writing Python. It’s written to help students get past the phase of writing ugly Python code and understand some of the most common idioms. The simplifications employed (for example, ignoring generators and the power of itertools when talking about iteration) reflect its intended audience.
There are always reasons to use some of these anti-patterns, which I’ve tried to give those where possible, but in general using these anti-patterns makes for less readable, more buggy, and less Pythonic code. If you’re looking for broader introductory materials for Python, I highly recommend The Python Tutorial or Dive into Python.
The use of
Programmers that are new to Python love using
perform simple iteration by applying it over the length of an
iterable and then getting each element:
Recite it in your sleep:
range is not for simple, obvious iterations over sequences.
For those used to numerically defined
range feels like home,
but using it for iteration over sequences is bug-prone and less clear than using the standard
for construct directly on an iterable. Just write:
range are prone to unfortunate off-by-one bugs. This is
commonly caused by forgetting that
range is inclusive in its first
argument and exclusive in its second, just like
and many, many, other functions of this type. Those who think too
hard about not overrunning the end of their sequence are going to
The common excuses for using
range inappropriately are:
Needing the index value in the loop. This isn't a valid excuse. Instead, you should write:
Needing to iterate over two loops at once, getting a value at the same index from each. In this case, you want to use
Needing to iterate over only part of a sequence. In this case, just iterate over a slice of the sequence and include a comment to make it clear that this was intentional:
An exception to this is when you're iterating over a sequence so big that the overhead introduced by slicing the would be very expensive. If your sequence is 10 items, this is unlikely to matter, but if it is 10 million items or this is done in a performance-sensitive inner loop, this is going to be very important. Consider using
xrangein this case.
An important use case of
range outside of iterating over a sequence
is when you genuinely need a list of numbers not to be used for indexing:
Using list comprehensions properly
If you have a loop that looks like this, you want to rewrite it as a list comprehension:
Instead, write a list comprehension:
Why do this? For one, you avoid any bugs related to correctly initializing
alist. Also, the code just looks a lot cleaner and what you’re doing is clearer.
For those from a functional programming background,
may feel more familiar, but I find it less Pythonic.
Some common excuses for not using a list comprehension:
You need to nest your loop. You can nest entire list comprehensions, or just put multiple loops inside a list comprehension. So, instead of writing:
Note that in a list comprehension with multiple loops, the loops have the same order as if you weren't making a list comprehension at all.
You need a condition inside your loop. But you can do this in a list comprehension just as easily:
A valid reason for not using a list comprehension is that you can’t do exception handling inside one. So if some items in the iteration will cause exceptions to be raised, you will need to either offload the exception handling to a function called by the list comprehension or not use a list comprehension at all.
Checking for contents in linear time
Syntactically, checking if something is contained in a list or a set/dictionary look alike, but under the hood things are different. If you need to repeatedly check whether something is contained in a data structure, use a set instead of a list. (You can use a dict if you need to associate a value with it and also get constant time membership tests.)
Keep in mind that creation of the set introduces one-time overhead; creation will take linear time even though membership testing takes constant time. So if you are checking for membership in a loop, it’s almost always worth it to take the time to build a set since you only have to build the set once.
Generally speaking, in Python the scope of a name is wider than one might expect given other languages. For example, in Java, the following code will not even compile:
However, in Python the equivalent will always compile and often produce the intended result:
This will work in all cases except when
y is empty; in that case the
loop never runs and the call to
processList will raise a
idx is not defined. If you use Pylint, it would warn you
about “Using possibly undefined loop variable idx.”
The solution is to always be explicit and set
idx to some special
value before the loop, so you know what to look for if the loop never
runs. This is called the Sentinel
Pattern. So what value should
you use for a sentinel? Starting with C or earlier, back when
ruled the Earth, a common pattern for a function that needed to return
an “expected error” result was to return
-1. For example,
let’s say you want to return the index of an item in a list:
In the general case,
None is a better sentinel of choice in Python,
even if it isn’t used consistently by Python’s standard types (e.g.,
str.find). See the style section for recommended ways to test for
The outer scope
Programmers new to Python often love to put everything in what is called the outer scope, the parts of a python file not contained in a block such as a function of class. The outer scope corresponds to the global namespace; for the purpose of this discussion, you should assume the contents of the global namespace are accessible anywhere within a single Python file.
The outer scope is great for defining constants that the whole module
needs access to that you want to declare at the top of a file. It’s
wise to give anything in the outer scope distinctive names, for
IN_ALL_CAPS. That makes it easier to avoid bugs like the
If you look closely, you’ll see that the definition of
names its argument
filenam, but the body of the function references
filename. However, this program works just fine. Why? In
print_file when a local variable named
filename isn’t found, the
next step is to look at the global namespace. Since the code that
calls print_file lives in the outer scope (even though it’s indented),
the variable filename declared there is visible to
So, how do you avoid problems like this? First, don’t set any values
in the outer scope that aren’t
IN_ALL_CAPS. Things like parsing
arguments are best delegated to a function named
main, so that any
internal variables in that function do not live in the outer scope.
This also serves as a reminder about the
global keyword. You do not
need the global keyword if you are just reading the value of a global
name. You only need it if you want to change what object a global
variable name refers to. See this discussion of the global keyword on
Stack Overflow for more
Style, style, style
Honor thy PEP 8
PEP 8 is the universal style guide for Python code. It’s wise to follow it as much as possible; the better you stick to it, the easier it will be for others to read your code. The guidelines below are all taken from PEP 8 and seem to be the ones people need to be reminded of most often.
Testing for empty
If you want to check whether a container type (e.g., list, dictionary,
set) is empty, simply test it instead of doing something
len(x) > 0:
If you want to store this result somewhere, use
bool is what is called to determine the
truth value of the target of
As I mentioned previously,
None makes a good sentinel
value. How should you check for it?
If you are specifically testing for
None and not just
other things that evaluate as
False (e.g., empty
containers, 0) use
If you are using
None as a sentinel, this is the
desired pattern; you want to distinguish
None from 0,
If you are just testing for whether you have something useful to work with, a simple if pattern is usually good enough:
For example, if
x is expected to be a container type, but could be
None based on the result of another function, this handles all the
cases that you care about at once. Just be aware that if you change
the value that goes into
x such that
0.0 are useful
values, this may not behave the way you want.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.