Saturday, September 19, 2009

Mutable objects as default arguments in Python

I spent quite some time Friday morning trying to figure out a weird problem. The problem did not show up when I ran the code from the command line, either on a production machine or in my local environment.

It turned out that I had made a mistake I had not known about. I had a function that looked like this:
def a(m=[]):
  m.append(1)
  return m
If you call a() three times, you get
[1]
[1, 1]
[1, 1, 1]
which was not what I expected. I used m as a flag to select algorithms, and the function a() was called exactly once in every process. In a single-threaded process, everything seemed OK. But in a multi-threaded environment, where one process handles many calls, a flag stays set for all subsequent calls once it has been set.
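
The persistence is easy to reproduce in a single interpreter session. The following is only a minimal sketch, not my production code: select_algorithm and the "fast" flag are hypothetical names made up for illustration. The one real Python feature it relies on is the __defaults__ attribute, where a function keeps its default values.

```python
def select_algorithm(flags=[]):
    # Hypothetical stand-in for the function above: the default list is
    # created once, when the def statement runs, and then reused.
    flags.append("fast")
    return flags

first = select_algorithm()
second = select_algorithm()

# Both calls returned the very same list object, so the flag set by the
# first call is still there in the second.
print(first is second)                # True
print(second)                         # ['fast', 'fast']
print(select_algorithm.__defaults__)  # (['fast', 'fast'],)
```

A process that calls the function only once never sees the second line of output, which would explain why the command-line runs looked fine.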

I checked the Python documentation and found that "the default value is evaluated only once." That is a potential problem if you use a list or dict as a default argument in Python. It essentially means that a list or dict default behaves like a static object shared by every call. If you only read the argument, you are fine. If you modify it in place, make sure you really intend the list or dict to act as a static object. In my case, I rewrote my function like this:
def a(m=[]):
  m = m + [1]
  return m
so m now points to a new list object, and the default list itself is never modified. Problem solved. Lesson learned.
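
A more common way to write the same kind of fix, and the one the Python tutorial suggests, is to default to None and build the list inside the function. This is a sketch of that idiom applied to the toy function a(), not the code I actually deployed:

```python
def a(m=None):
    # None is immutable, so it is safe as a default; a fresh list is
    # built on every call that does not pass one in.
    if m is None:
        m = []
    m.append(1)
    return m

print(a())     # [1]
print(a())     # [1]
print(a([5]))  # [5, 1]
```

One difference worth noting: this variant appends to a list the caller passes in, whereas m = m + [1] always returns a new list. Which one is right depends on whether callers expect their own list to change.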
