Janis Lesinskis' Blog


Default parameters and sentinel objects


One of the things you may wish to do in Python is to set a default value for a parameter of a function you define.

The really simple way of doing this is just to define the value you want directly in the parameters:

def foo(x=5):
    print(f"x was {x}")

So far so good!

This works well because the value 5 is immutable, but there are a number of situations with other default values that can cause issues.

The mutable parameter issue

Let's say we have some sort of line-of-business function and there are 3 really common products (foo, bar and baz) that we want to use as the default, but we also want the ability to look at some other product as well.

def problematic(other_product=None, product_names=["foo", "bar", "baz"]):
    if other_product is not None:
        product_names.append(other_product)
    print(product_names)

So in this case we have a function that will allow us to print out some information about a list of product names.

What do you expect to see when you run the following:

>>> problematic(other_product="quux"), problematic()

Take a moment to think what this will do then have a go at running it.

Did the results surprise you?

The first time I saw this behavior it came as a surprise to me because Python wasn't the first programming language I had used and I was used to different behavior from other languages.

Here's what happens when you run this in Python 3.6:

>>> problematic(other_product="quux"), problematic()
['foo', 'bar', 'baz', 'quux']
['foo', 'bar', 'baz', 'quux']
(None, None)

What's happened here is that product_names is a list (which is mutable), and its default value is created only once, when the def statement is executed, not each time the function is called. That single list is stored on the function object, so every call that omits product_names works on the same list.

We can verify that this is the case by using id, which is guaranteed to return the same value if both names refer to the same underlying object:

def problematic(other_product=None, product_names=["foo", "bar", "baz"]):
    print("id(product_names)", id(product_names))
    if other_product is not None:
        product_names.append(other_product)
    print(product_names)

Then we can run this:

>>> problematic(other_product="quux"), problematic()
id(product_names) 139760843633096
['foo', 'bar', 'baz', 'quux']
id(product_names) 139760843633096
['foo', 'bar', 'baz', 'quux']
(None, None)

And this is why we have the so-called "Python mutable default parameter issue": when we change product_names, the default for all subsequent calls changes as well, because it's actually the same object used in every call.
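
Another way to see this is to look at where the default is kept. The following is a minimal sketch, assuming the standard `__defaults__` attribute that Python function objects carry; this variant returns the list instead of printing it so the result can be compared directly.

```python
def problematic(other_product=None, product_names=["foo", "bar", "baz"]):
    if other_product is not None:
        product_names.append(other_product)
    return product_names


# The defaults are evaluated once, when the def statement runs, and are
# stored as a tuple on the function object itself.
defaults_before = list(problematic.__defaults__[1])

result = problematic(other_product="quux")

# The list the call worked on is the very same object as the stored default,
# so appending to it changed the default seen by every later call.
same_object = result is problematic.__defaults__[1]
defaults_after = list(problematic.__defaults__[1])

print(defaults_before)
print(defaults_after)
print(same_object)
```

The snapshot taken before the call holds the three original names, while the one taken afterwards also contains "quux", even though nothing ever assigned to the default explicitly.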

Default parameters with mutable types

Sometimes you want to have a default parameter that is a mutable type like a list. As we saw before we can't do this simply by placing the default for that parameter inline in the function definition so we have to go about it another way.

Perhaps one of the most common idioms is to set the default parameter value to None and then check for that inside the function. This works because None is immutable, so when the caller doesn't supply an argument the function sees None and can choose its real default at call time:

DEFAULT_PRODUCTS = ["foo", "bar", "baz"]
def less_problematic(other_product=None, product_names=None):
    if product_names is None:
        product_names = DEFAULT_PRODUCTS
    print("id(product_names)", id(product_names))
    if other_product is not None:
        product_names.append(other_product)
    print(product_names)

But this still has a subtle bug:

>>> less_problematic(other_product="quux"), less_problematic()
id(product_names) 139760843663880
['foo', 'bar', 'baz', 'quux']
id(product_names) 139760843663880
['foo', 'bar', 'baz', 'quux']
(None, None)

We have just shifted the problem one step away here and not actually solved it. As you can see, the same list is still shared between the calls, so the bug is not fixed. The mutable parameter issue is not really about where the default is written down; it is about several calls ending up with references to the same mutable object.

What happens here is that on every call that omits product_names the parameter is None, so the name product_names gets bound to DEFAULT_PRODUCTS. Assignment in Python does not make a copy: product_names and DEFAULT_PRODUCTS now refer to the same list. The append in the first call therefore modifies DEFAULT_PRODUCTS itself, and the second call picks up the already modified list. We have to prevent the reuse of that list to avoid this bug.
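
The aliasing can be checked directly with `is`. This is a small sketch of the same function, changed only so that it returns the list rather than printing it:

```python
DEFAULT_PRODUCTS = ["foo", "bar", "baz"]


def less_problematic(other_product=None, product_names=None):
    if product_names is None:
        # This binds a second name to the module-level list; no copy is made.
        product_names = DEFAULT_PRODUCTS
    if other_product is not None:
        product_names.append(other_product)
    return product_names


first = less_problematic(other_product="quux")
second = less_problematic()

# Both calls hand back the module-level list itself, not a copy of it.
print(first is DEFAULT_PRODUCTS)
print(second is DEFAULT_PRODUCTS)
# The "constant" has been modified by the first call.
print(DEFAULT_PRODUCTS)
```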

Another crucial issue here is that we can't actually distinguish whether a parameter was supplied by the caller or not when None is the default. First I'll show you a way to let a caller explicitly ask for the default, then we will come back to how to fix this mutability issue.

Figuring out if a user supplied a parameter

As we saw before, if we use None as a default we aren't actually able to determine whether the caller passed None themselves or left the parameter out. Take the following for example:

def parameter_example(param=None):
    if param is None:
        print("Using the default here")

Which gives the following:

>>> parameter_example()
Using the default here
>>> parameter_example(None)
Using the default here
>>> parameter_example(param=None)
Using the default here

As we can see, in the second and third cases we supplied param as an argument, but the branch in the function has no way of telling these two kinds of call apart from the one with no argument supplied [1]. This is because Python substitutes the default, here None, whenever a caller doesn't supply an argument, but the caller could also pass None themselves, and the parameter_example function can't tell those call sites apart.

As a side note, I strongly recommend using keyword-only arguments where possible, so that only one of the call styles in cases 2 and 3 is allowed. (Similarly, positional-only parameters in Python 3.8 and newer let you require that an argument is passed by position.)
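
As a rough illustration of that side note, here is a sketch of both forms. The function names and the `call_fails` helper are made up for this example, and the positional-only form needs Python 3.8 or newer.

```python
def keyword_only_example(*, param=None):
    # The bare * means param can only be passed as param=...
    return param


def positional_only_example(param=None, /):
    # The / means param can only be passed by position (Python 3.8+).
    return param


def call_fails(func, *args, **kwargs):
    # Helper for this sketch: report whether the call raises TypeError.
    try:
        func(*args, **kwargs)
    except TypeError:
        return True
    return False


print(keyword_only_example(param=None))                  # allowed
print(call_fails(keyword_only_example, None))            # positional call is rejected
print(positional_only_example(None))                     # allowed
print(call_fails(positional_only_example, param=None))   # keyword call is rejected
```

With either form there is only one way left to spell the call, which makes call sites easier to search for and refactor.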

If we want to give the user a way to explicitly declare their intention to use the default in their calling site code the easiest way is to use a sentinel object:

DEFAULT = object()
def parameter_sentinel_example(param=None):
    if param is DEFAULT:
        print("Using the default here")
    else:
        print("Not using the default")

Which gives the following:

>>> parameter_sentinel_example()
Not using the default
>>> parameter_sentinel_example(DEFAULT)
Using the default here
>>> parameter_sentinel_example(param=DEFAULT)
Using the default here

What we have here is a reliable way to tell if the caller explicitly asked for the default value. This allows the default value to change over time while the call site still states that it wants the default. The sentinel object is useful here because it can't accidentally be confused with regular data: a bare object() is a unique instance that supports none of the arithmetic or container operations that real data would use, but it can still be checked for identity with is.
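
The same idea also covers the earlier problem of telling an omitted argument apart from an explicit None. A minimal sketch, assuming the sentinel is used as the parameter's own default; `_MISSING` and `describe_param` are hypothetical names for illustration only:

```python
_MISSING = object()  # hypothetical sentinel name; any unique object works


def describe_param(param=_MISSING):
    # Identity check: only the sentinel itself satisfies `is _MISSING`.
    if param is _MISSING:
        return "no argument supplied"
    if param is None:
        return "caller explicitly passed None"
    return f"caller passed {param!r}"


print(describe_param())
print(describe_param(None))
print(describe_param(param=5))
```

Because None is no longer doing double duty as "nothing supplied", it is free to be a meaningful value that callers can pass.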

If you'd like more details on the implementation of how equality vs identity works, have a look at the poster I presented at PyCon on this topic.

Bringing it all together: a mutable default parameter

What we have to do is make sure we copy the default values any time we want to use the defaults, rather than referencing the same list multiple times:

import copy
DEFAULT_PRODUCTS = ["foo", "bar", "baz"]
USE_DEFAULT = object()
def mutable_default_example(other_product=None, product_names=USE_DEFAULT):
    if product_names is USE_DEFAULT:
        product_names = copy.deepcopy(DEFAULT_PRODUCTS)
    print("id(product_names)", id(product_names))
    if other_product is not None:
        product_names.append(other_product)
    print(product_names)

Running this:

>>> mutable_default_example(other_product="quux"), mutable_default_example()
id(product_names) 139760843633416
['foo', 'bar', 'baz', 'quux']
id(product_names) 139760843633416
['foo', 'bar', 'baz']
(None, None)

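As you can see, this allows you to use a mutable data type in the function code and also provide a default value: the second call prints the unmodified defaults, so the two calls no longer share a list. The ids most likely match here only because the first copy had already been freed by the time the second one was created, and an id may be reused once an object's lifetime has ended.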
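
To check that each call really gets its own list, one option is to keep both results alive at the same time and compare them. This is a sketch of the same function that returns the list instead of printing it; `mutable_default_returning` is a hypothetical name for that variant.

```python
import copy

DEFAULT_PRODUCTS = ["foo", "bar", "baz"]
USE_DEFAULT = object()


def mutable_default_returning(other_product=None, product_names=USE_DEFAULT):
    if product_names is USE_DEFAULT:
        # A fresh copy per call, so the module-level list is never touched.
        product_names = copy.deepcopy(DEFAULT_PRODUCTS)
    if other_product is not None:
        product_names.append(other_product)
    return product_names


first = mutable_default_returning(other_product="quux")
second = mutable_default_returning()

# Both results are still referenced here, so they must be distinct objects.
print(first is second)
print(first)
print(second)
print(DEFAULT_PRODUCTS)
```

For a flat list of strings a shallow copy such as list(DEFAULT_PRODUCTS) would behave the same way; deepcopy matters once the defaults contain nested mutable values.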


  1. A while ago I tried to figure out if there was some way of consistently detecting the calling structure from existing code using only application-level code. I wanted to be able to determine how a function was called: for example, was an argument passed by keyword or by position? Did the call site provide the argument or was it a default? Unfortunately I couldn't quite figure out a reliable way of doing this. I'm not sure it's possible without recompiling a customized version of the Python interpreter that would carry over calling information. You probably wouldn't want to run something like this in production because it would make Python's already slow function calls even slower, but it would allow you to identify which function call signatures could be safely refactored to keyword-only arguments and similar.

Published: Tue 18 August 2020
By Janis Lesinskis
In Software-engineering
Tags: Python software-engineering
