Janis Lesinskis' Blog

Type annotations in Python 3.7


Forward declarations of types in Python used to be a pain, but they are now much easier. The ability to do static analysis of types is a relatively new addition to the Python language. Some history helps explain why things were the way they were, and why postponed evaluation of annotations (originally scheduled to become the default in Python 3.10, although that switch was later postponed) works the way it does.

Some history

If you are working on production systems written in Python where code correctness is important, type annotations are arguably one of the most important features to have been introduced to the language. The ability to add annotations inline to functions was added in Python 3.0, after PEP 3107 was accepted in 2006. This allowed you to attach extra metadata that would then get stored in __annotations__. Interestingly, the PEP never stated that there were any standard semantics for these annotations, and one of the examples from the PEP 3107 document looks very strange now:

def compile(source: "something compilable",
            filename: "where the compilable thing comes from",
            mode: "is this a single statement or a suite?"):
    ...

What happens is that at the time the function is defined, compile.__annotations__ is assigned a dictionary with the parameter names as keys and the associated annotations as values. While this is still legal Python code, it looks quite strange in the era since PEP 484, which introduced standard semantics for type hints. A core component of PEP 484 was the introduction of the typing module in Python 3.5. The typing module gives you the fundamental building blocks for describing Python types. From Python 3.5 onwards type checking your code was feasible, but it wasn't until the development of the mypy type checker, along with the creation of extensive type stubs for the standard library via the typeshed project, that type annotations became more powerful and hence more widely used. Another reason it took a while for type annotations to become mainstream was that a number of annoyances existed that only really went away when new language features were introduced.
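To make the __annotations__ behaviour described above concrete, here is a minimal sketch based on the PEP 3107 example. The function is renamed to compile_source (a name chosen here, not taken from the PEP) purely to avoid shadowing the built-in compile.

```python
# The annotations are arbitrary expressions (here, plain strings); Python
# attaches no meaning to them and simply stores them on the function.
def compile_source(source: "something compilable",
                   filename: "where the compilable thing comes from",
                   mode: "is this a single statement or a suite?"):
    ...


# A dictionary mapping parameter names to whatever was written after the colon.
stored = compile_source.__annotations__

for name, annotation in stored.items():
    print(f"{name}: {annotation!r}")
```

Nothing in the interpreter checks these values, which is why a separate tool such as mypy is needed to give them meaning.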

For example, Python 3.5 still had a major shortcoming: annotations were initially only available on functions. If you wanted a tool like mypy to know the types of variables, you had to annotate them in a comment. This was very annoying when you wanted to specify only the type without initializing the variable, since in Python 3.5 and earlier introducing a new name meant you had to assign it a value.

The situation where I found this particularly annoying was when you wanted to annotate a class variable but delay initialization until later (perhaps in __init__ or __new__). In Python 3.6 you can do:

class ExampleFile:
    base_path: str

    def __init__(self, filename: str, base_path: str = "/examples") -> None:
        self.filename = filename
        self.base_path = base_path

You couldn't make this sort of annotation for base_path in a comment, because the language syntax meant you had to initialize the variable for the comment to have something to attach to. The ability to annotate variables in this way without initializing them was introduced by PEP 526, which was implemented in Python 3.6.
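As a rough illustration of the difference, the sketch below puts the older comment style next to the PEP 526 style. LegacyExampleFile is a hypothetical name made up for this comparison, and the placeholder None value is an assumption about how you might have worked around the limitation.

```python
class LegacyExampleFile:
    # Pre-3.6 style: the type comment needs an assignment to attach to,
    # so a placeholder value has to be invented.
    base_path = None  # type: str


class ExampleFile:
    # PEP 526 style: annotated, but deliberately not initialized here.
    base_path: str

    def __init__(self, filename: str, base_path: str = "/examples") -> None:
        self.filename = filename
        self.base_path = base_path


# The bare annotation is recorded on the class...
class_annotations = ExampleFile.__annotations__
# ...but no class attribute is created until __init__ assigns one per instance.
new_style_has_attr = hasattr(ExampleFile, "base_path")
legacy_has_attr = hasattr(LegacyExampleFile, "base_path")

example = ExampleFile("data.txt")
print(class_annotations, new_style_has_attr, legacy_has_attr, example.base_path)
```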

Forward declarations of types

Because type annotations were evaluated at the time a function was defined, one major annoyance has been the problem of forward declarations. For example, consider:

class Node:
    """Binary tree node."""

    def __init__(self, left: Node, right: Node) -> None:
        self.left = left
        self.right = right

Because a Python class name is bound only after the body of the class has been executed, there was no easy way to refer to Node inside the definition of the class Node. See for example what happens if you try to run something like this in Python 3.5/3.6:

$ python3.5
Python 3.5.2 (default, Apr 16 2020, 17:47:17) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> class Node:
...     def __init__(self, left: Node, right: Node) -> None:
...         self.left = left
...         self.right = right
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in Node
NameError: name 'Node' is not defined

To deal with this in Python 3.5 and 3.6 you had to mess around with type annotations as strings:

class Node:
    """Binary tree node."""

    def __init__(self, left: 'Node', right: 'Node') -> None:
        self.left = left
        self.right = right

    def get_left(self) -> 'Node':
        return self.left
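String annotations are stored as plain strings, so anything that wants the real class has to resolve them later. A minimal sketch of that, using typing.get_type_hints from the standard library (the variable names raw and resolved are just for illustration):

```python
from typing import get_type_hints


class Node:
    """Binary tree node."""

    def __init__(self, left: 'Node', right: 'Node') -> None:
        self.left = left
        self.right = right


# The raw annotations are kept exactly as written: plain strings.
raw = Node.__init__.__annotations__

# get_type_hints() evaluates those strings in the function's global
# namespace, by which point the name Node has been bound.
resolved = get_type_hints(Node.__init__)

print(raw)
print(resolved)
```

This assumes Node is defined at module level, since that is where get_type_hints looks up names by default.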

PEP 563 introduces changes that allow you to postpone the evaluation of annotations. This behavior was originally scheduled to become the default in Python 3.10, although that switch was later postponed. Because it is a backwards-incompatible change, you need a __future__ import to opt in to it in Python 3.7:

$ python3.7
Python 3.7.4 (default, Aug 13 2019, 20:35:49) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from __future__ import annotations
>>> class Node:
...     def __init__(self, left: Node, right: Node) -> None:
...         self.left = left
...         self.right = right
... 

As you can see, this now works without quoting the annotations, as long as the __future__ import is present.
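Under the hood, postponed evaluation means the annotations are stored as strings instead of being evaluated when the function is defined. Here is a small sketch of that. A __future__ import has to be the first statement in a module, so to keep the example pasteable anywhere it compiles a tiny module from a string; the namespace name demo_module is made up for the illustration.

```python
from typing import get_type_hints

# Source for a stand-alone module that opts in to postponed evaluation.
source = '''
from __future__ import annotations

class Node:
    def __init__(self, left: Node, right: Node) -> None:
        self.left = left
        self.right = right
'''

namespace = {"__name__": "demo_module"}  # hypothetical module name
exec(source, namespace)
Node = namespace["Node"]

# With the __future__ import, every annotation is stored as a string...
stored = Node.__init__.__annotations__
# ...and is only resolved to real objects when something asks for that.
resolved = get_type_hints(Node.__init__)

print(stored)
print(resolved)
```

In a normal source file you would simply put the __future__ import on the first line and define the class as usual.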

Published: Tue 31 August 2021
By Janis Lesinskis
In Software-engineering
Tags: python mypy tooling linting type-systems API-design static-analysis static-typing
