Guest post from Toptal Engineering Blog.
Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components or services. Python supports modules and packages, thereby encouraging program modularity and code reuse.
About this article
Python’s simple, easy-to-learn syntax can mislead Python developers – especially those who are newer to the language – into missing some of its subtleties and underestimating the power of the diverse Python language.
With that in mind, this article presents a “top 10” list of somewhat subtle, harder-to-catch mistakes that can bite even some more advanced Python developers in the rear.
(Note: This article is intended for a more advanced audience than Common Mistakes of Python Programmers, which is geared more toward those who are newer to the language.)
Common Mistake #1: Misusing expressions as defaults for function arguments
Python allows you to specify that a function argument is optional by providing a default value for it. While this is a great feature of the language, it can lead to some confusion when the default value is mutable. For example, consider this Python function definition:
A common mistake is to think that the optional argument will be set to the specified default expression each time the function is called without supplying a value for the optional argument. In the above code, for example, one might expect that calling the function repeatedly without the optional argument would always return a list containing only the newly appended item, since the assumption would be that each such call sets the argument to a new empty list.
But let’s look at what actually happens when you do this:
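A minimal sketch of the behavior, assuming a function with the hypothetical names `foo` and `bar` (illustrative placeholders only) that appends one item to its optional list argument and returns it:

```python
# Hypothetical function: the default list for `bar` is created only once,
# when the `def` statement runs.
def foo(bar=[]):
    bar.append("baz")  # mutates the single shared default list
    return bar

# Copy each result so that later calls do not alter what we recorded.
first = list(foo())
second = list(foo())
third = list(foo())

print(first)   # ['baz']
print(second)  # ['baz', 'baz']
print(third)   # ['baz', 'baz', 'baz']
```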
Huh? Why did it keep appending the value to the same existing list on every call, rather than creating a new list each time?
The more advanced Python programming answer is that the default value for a function argument is evaluated only once, at the time that the function is defined. Thus, the argument is initialized to its default (i.e., an empty list) only when the function is first defined, and later calls that omit the argument continue to use that same list.
FYI, a common workaround for this is as follows:
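One widely used form of this workaround is to default to `None` and build the list inside the function body. The names `foo` and `bar` below are hypothetical placeholders:

```python
def foo(bar=None):
    if bar is None:
        # A fresh list is created on every call that omits `bar`.
        bar = []
    bar.append("baz")
    return bar

results = [foo(), foo(), foo()]
print(results)  # [['baz'], ['baz'], ['baz']]
```

Using `None` as a sentinel works because `None` is immutable, so sharing it across calls is harmless.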
Common Mistake #2: Using class variables incorrectly
Consider the following example:
Yup, again as expected.
What the $%#!&?? We only changed the attribute on the base class. Why did a subclass's value change too?
In Python, class attributes are stored in per-class dictionaries, and attribute lookup follows what is often referred to as the Method Resolution Order (MRO). So in the above code, since the attribute is not found in the subclass itself, it is looked up in its base classes (only one in the above example, although Python supports multiple inheritance). In other words, the subclass doesn’t have its own copy of the attribute, independent of the base class. Thus, references to the attribute through the subclass are in fact references to the base class's attribute. This causes a Python problem unless it’s handled properly. Learn more about class attributes in Python.
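The lookup behavior can be reproduced with a small sketch. The class names `A`, `B`, `C` and the attribute `x` are hypothetical, chosen only for illustration:

```python
class A:
    x = 1

class B(A):
    pass

class C(A):
    pass

before = (A.x, B.x, C.x)   # (1, 1, 1): B and C find x on A
B.x = 2                    # creates a separate attribute in B's own namespace
after_b = (A.x, B.x, C.x)  # (1, 2, 1)
A.x = 3                    # C still has no x of its own, so C.x follows A.x
after_a = (A.x, B.x, C.x)  # (3, 2, 3)

print(before, after_b, after_a)
print("x" in vars(B), "x" in vars(C))  # True False
```

Checking `vars(C)` shows that the subclass never stored the attribute itself, which is why the change made on the base class shows through.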
Common Mistake #3: Specifying parameters incorrectly for an exception block
Suppose you have the following code:
The problem here is that the except statement does not take a list of exceptions specified in this manner. Rather, in Python 2.x, the comma syntax is used to bind the exception to the optional second name specified, in order to make it available for further inspection. As a result, in the above code, the second exception type is not being caught by the except statement; rather, the caught exception instead ends up being bound to a variable with that name.
The proper way to catch multiple exceptions in an except statement is to specify the first parameter as a tuple containing all exceptions to be caught. Also, for maximum portability, use the as keyword, since that syntax is supported by both Python 2 and Python 3:
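A minimal sketch of the tuple form. The function `parse_item` and its arguments are hypothetical, and the two exception types are just examples:

```python
def parse_item(items, index):
    """Return items[index] as an int, or the name of the error that occurred."""
    try:
        return int(items[index])
    except (ValueError, IndexError) as e:  # tuple of types, bound with `as`
        return type(e).__name__

print(parse_item(["1", "x"], 0))  # 1
print(parse_item(["1", "x"], 1))  # ValueError
print(parse_item(["1", "x"], 5))  # IndexError
```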
Common Mistake #4: Misunderstanding Python scope rules
Python scope resolution is based on what is known as the LEGB rule, which is shorthand for Local, Enclosing, Global, Built-in. Seems straightforward enough, right? Well, actually, there are some subtleties to the way this works in Python, which brings us to the common more advanced Python programming problem below. Consider the following:
What’s the problem?
The above error occurs because, when you make an assignment to a variable in a scope, that variable is automatically considered by Python to be local to that scope and shadows any similarly named variable in any outer scope.
Many are thereby surprised to get an UnboundLocalError in previously working code when it is modified by adding an assignment statement somewhere in the body of a function. (You can read more about this here.)
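A small sketch of how adding an assignment changes the outcome. The names `x`, `read_only`, and `read_then_assign` are hypothetical:

```python
x = 10

def read_only():
    return x + 1  # no assignment to x here, so the global x is used

def read_then_assign():
    x += 1        # the assignment makes x local, so reading it first fails
    return x

print(read_only())  # 11

error_name = None
try:
    read_then_assign()
except UnboundLocalError as exc:
    error_name = type(exc).__name__
print(error_name)  # UnboundLocalError
```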
It is particularly common for this to trip up developers when using lists. Consider the following example:
Huh? Why did one function bomb while the other ran fine?
The answer is the same as in the prior example problem, but is admittedly more subtle. The function that ran fine is not making an assignment to the list, whereas the one that failed is. Remembering that an augmented assignment with += is, as far as scoping goes, still an assignment to the name on its left, we see that we are attempting to assign a value to the list's name (therefore presumed by Python to be in the local scope). However, the value we are looking to assign is based on that same name (again, now presumed to be in the local scope), which has not yet been defined. Boom.
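The difference can be sketched with two hypothetical functions, `append_item` and `augment_list`, operating on a global list called `lst`:

```python
lst = [1, 2, 3]

def append_item():
    lst.append(5)  # a method call on the global list; `lst` is never assigned

def augment_list():
    lst += [5]     # an assignment, so `lst` is treated as local to this function

append_item()
print(lst)  # [1, 2, 3, 5]

try:
    augment_list()
    outcome = "ok"
except UnboundLocalError:
    outcome = "UnboundLocalError"
print(outcome)  # UnboundLocalError
```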
Common Mistake #5: Modifying a list while iterating over it
The problem with the following code should be fairly obvious:
Deleting an item from a list or array while iterating over it is a Python problem that is well known to any experienced software developer. But while the example above may be fairly obvious, even advanced developers can be unintentionally bitten by this in code that is much more complex.
Fortunately, Python incorporates a number of elegant programming paradigms which, when used properly, can result in significantly simplified and streamlined code. A side benefit of this is that simpler code is less likely to be bitten by the accidental-deletion-of-a-list-item-while-iterating-over-it bug. One such paradigm is that of list comprehensions. Moreover, list comprehensions are particularly useful for avoiding this specific problem, as shown by this alternate implementation of the above code which works perfectly:
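As a sketch, assuming the task is to drop the odd numbers from a list (the helper `is_odd` and the list `numbers` are hypothetical), the index-based loop fails while the comprehension does not:

```python
def is_odd(n):
    return bool(n % 2)

numbers = list(range(10))

# Buggy approach: the list shrinks while the index range stays fixed.
buggy = list(numbers)
loop_error = None
try:
    for i in range(len(buggy)):
        if is_odd(buggy[i]):
            del buggy[i]
except IndexError as exc:
    loop_error = type(exc).__name__
print(loop_error)  # IndexError

# Comprehension: build the filtered list first, then replace the contents in place.
numbers[:] = [n for n in numbers if not is_odd(n)]
print(numbers)  # [0, 2, 4, 6, 8]
```

Assigning to `numbers[:]` keeps the same list object, which matters if other code holds a reference to it; a plain `numbers = [...]` would also work when that is not a concern.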
Common Mistake #6: Confusing how Python binds variables in closures
Consider the following example:
You might expect the following output:
But you actually get:
This happens due to Python’s late binding behavior which says that the values of variables used in closures are looked up at the time the inner function is called. So in the above code, whenever any of the returned functions are called, the value of the loop variable is looked up in the surrounding scope at the time it is called (and by then, the loop has completed, so the variable has already been assigned its final value of 4).
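A minimal sketch of late binding, assuming a hypothetical helper `create_multipliers` that builds five functions in a loop over `range(5)`:

```python
def create_multipliers():
    # Every lambda refers to the same variable `i`, not to its value
    # at the moment the lambda was created.
    return [lambda x: i * x for i in range(5)]

late_bound = [multiplier(2) for multiplier in create_multipliers()]
print(late_bound)  # [8, 8, 8, 8, 8]
```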
The solution to this common Python problem is a bit of a hack:
Voilà! We are taking advantage of default arguments here to generate anonymous functions in order to achieve the desired behavior. Some would call this elegant. Some would call it subtle. Some hate it. But if you’re a Python developer, it’s important to understand in any case.
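The default-argument trick can be sketched as follows, again with the hypothetical helper `create_multipliers`:

```python
def create_multipliers():
    # `i=i` evaluates the loop variable when each lambda is defined,
    # storing the current value as that lambda's own default.
    return [lambda x, i=i: i * x for i in range(5)]

early_bound = [multiplier(2) for multiplier in create_multipliers()]
print(early_bound)  # [0, 2, 4, 6, 8]
```

This relies on the same rule that causes trouble with mutable defaults: default values are evaluated once, at definition time.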
Common Mistake #7: Creating circular module dependencies
Let’s say you have two files, each of which imports the other, as follows:
And in the second file:
First, let’s try importing the first module:
Worked just fine. Perhaps that surprises you. After all, we do have a circular import here which presumably should be a problem, shouldn’t it?
The answer is that the mere presence of a circular import is not in and of itself a problem in Python. If a module has already been imported, Python is smart enough not to try to re-import it. However, depending on the point at which each module is attempting to access functions or variables defined in the other, you may indeed run into problems.
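So returning to our example, when we imported the first module, it had no problem importing the second, since the second does not require anything from the first to be defined at the time it is imported. The only reference in the second module to the first is a call to one of its functions. But that call sits inside a function body, and nothing in either module invokes that function at import time. So life is good.
But what happens if we attempt to import the second module (without having previously imported the first, that is):
Uh-oh. That’s not good! The problem here is that, in the process of importing the second module, Python attempts to import the first, which in turn calls a function at import time that tries to access a variable defined in the second module. But that variable has not yet been defined. Hence the exception.
At least one solution to this is quite trivial. Simply modify the second module so that it imports the first inside the function that needs it:
Now when we import it, everything is fine: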
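A self-contained sketch of the deferred-import fix. It writes two throwaway modules to a temporary directory; the module names `circ_first` and `circ_second` and the functions `f` and `g` are hypothetical stand-ins:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# First module: imports the second at the top and calls f() at import time.
FIRST_SOURCE = (
    "import circ_second\n"
    "\n"
    "def f():\n"
    "    return circ_second.x\n"
    "\n"
    "value_at_import = f()  # runs while this module is being imported\n"
)

# Second module: defers its import of the first until g() is actually called.
SECOND_SOURCE = (
    "x = 1\n"
    "\n"
    "def g():\n"
    "    import circ_first  # deferred import\n"
    "    return circ_first.f()\n"
)

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "circ_first.py").write_text(FIRST_SOURCE)
    Path(tmp, "circ_second.py").write_text(SECOND_SOURCE)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        circ_second = importlib.import_module("circ_second")
        result = circ_second.g()
    finally:
        sys.path.remove(tmp)

print(result)  # 1
```

By the time `g()` triggers the import of the first module, the second module has finished executing, so `x` already exists.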
Common Mistake #8: Name clashing with Python Standard Library modules
One of the beauties of Python is the wealth of library modules that it comes with “out of the box”. But as a result, if you’re not consciously avoiding it, it’s not that difficult to run into a name clash between the name of one of your modules and a module with the same name in the standard library that ships with Python (for example, you might have a module in your code whose name matches that of a standard library module).
This can lead to gnarly problems, such as importing another library which in turn tries to import the Python Standard Library version of a module but, since you have a module with the same name, the other package mistakenly imports your version instead of the one within the Python Standard Library. This is where bad Python errors happen.
Care should therefore be exercised to avoid using the same names as those in the Python Standard Library modules. It’s way easier for you to change the name of a module within your package than it is to file a Python Enhancement Proposal (PEP) to request a name change upstream and to try and get that approved.
Common Mistake #9: Failing to address differences between Python 2 and Python 3
Consider the following file:
On Python 2, this runs fine:
But now let’s give it a whirl on Python 3:
What has just happened here? The “problem” is that, in Python 3, the exception object is not accessible beyond the scope of the except block. (The reason for this is that, otherwise, it would keep a reference cycle with the stack frame in memory until the garbage collector runs and purges the references from memory. More technical detail about this is available here.)
One way to avoid this issue is to maintain a reference to the exception object outside the scope of the except block so that it remains accessible. Here’s a version of the previous example that uses this technique, thereby yielding code that is both Python 2 and Python 3 friendly:
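A sketch of the technique, assuming a hypothetical function `bar` that raises different exceptions depending on its argument and a caller `good` that keeps a reference to whatever was raised:

```python
def bar(i):
    if i == 1:
        raise KeyError(1)
    if i == 2:
        raise ValueError(2)

def good(i):
    exception = None  # lives outside the except blocks
    try:
        bar(i)
    except KeyError as e:
        exception = e  # `e` itself is unbound once this block ends in Python 3
    except ValueError as e:
        exception = e
    return exception

print(type(good(1)).__name__)  # KeyError
print(type(good(2)).__name__)  # ValueError
print(good(0))                 # None
```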
Running this on Py3k:
(Incidentally, our Python Hiring Guide discusses a number of other important differences to be aware of when migrating code from Python 2 to Python 3.)
Common Mistake #10: Misusing the __del__ method
Let’s say you had this in a module file:
And you then tried to do this from another module:
You’d get an ugly exception.
Why? Because, as reported here, when the interpreter shuts down, the module’s global variables are all set to None. As a result, in the above example, at the point that __del__ is invoked, the module-level name it relies on has already been set to None.
A solution to this somewhat more advanced Python programming problem would be to use atexit.register() instead. That way, when your program is finished executing (when exiting normally, that is), your registered handlers are kicked off before the interpreter is shut down.
With that understanding, a fix for the above code might then look something like this:
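A minimal sketch, assuming a hypothetical class `Bar` that owns a resource handle (modeled here as a plain dictionary) and a hypothetical `cleanup` function that releases it:

```python
import atexit

def cleanup(handle):
    # Stand-in for whatever release logic the real code needs.
    handle["open"] = False

class Bar:
    def __init__(self):
        self.myhandle = {"open": True}  # hypothetical resource handle
        # Handlers registered with atexit run at normal interpreter exit,
        # before module globals are torn down.
        atexit.register(cleanup, self.myhandle)

bar = Bar()
print(bar.myhandle["open"])  # True
```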
This implementation provides a clean and reliable way of calling any needed cleanup functionality upon normal program termination. Obviously, it’s up to the cleanup function to decide what to do with the handle object passed to it, but you get the idea.
Python is a powerful and flexible language with many mechanisms and paradigms that can greatly improve productivity. As with any software tool or language, though, having a limited understanding or appreciation of its capabilities can sometimes be more of an impediment than a benefit, leaving one in the proverbial state of “knowing enough to be dangerous”.
Familiarizing oneself with the key nuances of Python, such as (but by no means limited to) the moderately advanced programming problems raised in this article, will help optimize use of the language while avoiding some of its more common errors.
You might also want to check out our Insider’s Guide to Python Interviewing for suggestions on interview questions that can help identify Python experts.
We hope you’ve found the pointers in this article helpful and welcome your feedback.
UnboundLocalError: local variable referenced before assignment
So that sucks
Howdya fix that?
I’ll tell you. First, a simple example of the problem:
#This is valid python 2.5 code

#My global variable:
USER_COUNT = 0

#functions:
def Main():
    AddUser()

def AddUser():
    print 'There are',USER_COUNT,'users so far'

# actually run Main()
Main()
^ That program WORKS FINE. Function AddUser() just references the USER_COUNT variable, which was declared as a global (outside of any of the function blocks).
Here’s where it goes wrong: when we try to write to, or update the value of, the global variable:
#USER_COUNT is a GLOBAL variable
USER_COUNT = 0

def Main():
    AddUser()

def AddUser():
    USER_COUNT = USER_COUNT + 1
    print 'There are',USER_COUNT,'users so far'

Main()
Then we get: UnboundLocalError: local variable ‘USER_COUNT’ referenced before assignment
So that sucks.
The reason this happens is that AS SOON AS YOU WRITE TO A VARIABLE, that variable is AUTOMATICALLY considered LOCAL to the function block in which it’s written. Namely:
#USER_COUNT is a GLOBAL!
USER_COUNT = 0

def AddUser():
    USER_COUNT = USER_COUNT + 1
    print 'There are',USER_COUNT,'users so far'
EVEN THOUGH we declared USER_COUNT as a GLOBAL, the simple act of WRITING TO IT __ANYWHERE__ in the function scuzzles-up the “globalness” of the USER_COUNT variable, and like, automatically makes ANY use of USER_COUNT refer to a LOCAL VARIABLE inside of AddUser().
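You can actually see this decision being made when the function is compiled, before any of it runs. Here’s a quick sketch (Python 3 syntax; the function names `reads_only` and `writes_somewhere` are made up for illustration) that peeks at each function’s list of local names:

```python
USER_COUNT = 0

def reads_only():
    return USER_COUNT      # looked up as a global at run time

def writes_somewhere():
    if False:
        USER_COUNT = 1     # never executes, yet still makes the name local
    return USER_COUNT

print(reads_only.__code__.co_varnames)        # ()
print(writes_somewhere.__code__.co_varnames)  # ('USER_COUNT',)

try:
    writes_somewhere()
    outcome = "ok"
except UnboundLocalError:
    outcome = "UnboundLocalError"
print(outcome)  # UnboundLocalError
```

So the write doesn’t even have to happen. Its mere presence in the function body is enough.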
So howdya fix it?
Easy! You do this:
#global
USER_COUNT = 0

def Main():
    AddUser()

def AddUser():
    global USER_COUNT  ######!!! IMPORTANT !!!  Make sure
    # to use the GLOBAL version of USER_COUNT, not some
    # locally defined copy of that.  I think this
    # might be a python feature to stop functions from
    # clobbering the global variables in a program
    USER_COUNT = USER_COUNT + 1
    print 'There are',USER_COUNT,'users so far'

Main()
This is an ok-nice feature that might stop a program’s functions from clobbering the globals (since you really have this “just use it” attitude to a variable and you may have no clue that you’re clobbering a global), but really it might be nice if python were more consistent and required use of this global thing for BOTH read/write. Though I guess it could be kinda convenient behavior .. I don’t know yet, haven’t programmed in python for long enough.