The release of Python 3.10 is
expected on October 4, 2021. As
with previous releases, this release will introduce new language features and
constructs. I’m looking forward to one special new feature in the upcoming
release, namely
**Pattern Matching**. If you
have experience in a functional programming language, like Haskell, or a
language with support for programming in a functional
style like Scala (just to name two), pattern matching is
one of the standard items in your toolbox. However, if you are not yet familiar with the
concept or its implementation in
Python, this article is for you. I
will walk you through two examples in increasing complexity to whet your appetite until it’s
finally released.

## It’s not yet released?

How can I demonstrate this new language feature when it’s not yet released? Well, you can get a glimpse of the new feature by using an early version of the release. As of this writing, there are Python 3.10 alpha versions available on Docker Hub. Beta versions and release candidates should follow.

For example, to play with the code in this article interactively, run

```
$ docker run --rm -it python:3.10.0a7-buster
```

## Simple matching

On a superficial level, pattern matching is a glorified `match`

statement,
that looks very similar to the `switch ... case`

statement in Java or C/C++.
You *match* a variable against several *cases* in order, and get the behaviour
defined in the first matching case clause. There is no fall-through in Python.
Consider the following function.

```
def speak(number):
"""Print a translation of number"""
match number:
# Indentation for visual clarity
case 0: print("nothing")
case 1: print("one")
case 2: print("two")
case 3: print("many")
case x if x > 0: print("lots")
case _: print("??")
```

The function uses a single match statement to print a verbal translation of
the given number. The first four cases (`0`

- `3`

) look pretty unsurprising and do
exactly what you would expect for a `switch`

in Java or C/C++. If the input
is, let’s say, `1`

, the function prints “one”. If the input is `2`

, it prints
“two”.

For the `lots`

case, it gets more interesting. Every given number is matched
against the pattern `x`

with an additional condition. Any number matches the
sole name capture `x`

. This means the number is available as variable `x`

for
the remainder of the pattern and the case. The additional condition limits the
execution to positive numbers.
The order of the cases resolves the ambiguity with numbers 0, 1, 2, and 3. The
`nothing`

, `one`

, `two`

, and `many`

cases take precedence over `lots`

.
The `lots`

case matches any number greater or equal to 4.

The last case matches anything and therefore corresponds to a `default`

case
in some languages. If none of the above patterns match the given expression,
the default case is executed. In our case, `speak(-1)`

yields `??`

.

## Structural pattern matching

So far, we’ve only scratched the surface of pattern matching in Python. Pattern matching entails so much more than comparing integers with additional conditions. You can match an expression against composite patterns using built-in classes, like dictionaries, list or dataclasses (new in Python 3.7). In the following example, I will focus on dataclasses and built a simple system to represent and simplify symbolic, arithmetic expressions.

The example is borrowed from the book *Programming in Scala* by Martin Odersky
et al. (Scala has excellent support for pattern matching even extending the
feature set of Python, e.g., see
Extractors).

Let’s start by defining a few dataclasses to represent numbers, variables,
binary operations (as in `a + b`

), and unary operations (as in `-a`

). Each class
is a subclass of the generic `Expression`

class.

```
from dataclasses import dataclass
class Expression:
"""Generic expression"""
@dataclass
class BinaryOp(Expression):
op: str
left: Expr
right: Expr
@dataclass
class UnaryOp(Expression):
op: str
arg: Expr
@dataclass
class Number(Expression):
value: float
@dataclass
class Variable(Expression):
name: str
```

With these classes set up, we are now able to define expressions.

```
a = Variable("a")
b = Variable("b")
c = Variable("c")
# c*c = a*a + b*b
pythagoras = BinaryOp("=", BinaryOp("*", c, c),
BinaryOp("+", BinaryOp("*", a, a),
BinaryOp("*", b, b)))
# (a + 1) * b
score = BinaryOp("*", BinaryOp("+", a, Number(1)), b)
```

Granted, this is not the nicest way to define the expressions. We will deal with this aspect later. For now, it is important to have a data structure to store and represent simple mathematical expressions.

Assume it is your task to implement the function `simplify(expr)`

that takes an
arbitrary expression as input, applies a set of simplification rules, and returns
the result. For example, any occurrence of

- \(+e\) should be replaced with just \(e\),
- \(e + 0\) should be replaced with just \(e\),
- \(0 + e\) should be replaced with just \(e\),
- \(e * 1\) should be replaced with just \(e\),
- \(1 * e\) should be replaced with just \(e\),
- \(-(-e)\) should be replaced with just \(e\),
- \(e + (-f)\) should be replaced with \(e - f\),
- and so on…

where \(e\) and \(f\) are arbitrary expressions. Take a minute and picture in your head how you would write such a function without pattern matching. It would probably involve recursive calls, and I bet it would involve a lot of code like this:

```
if isinstance(left, ...) and left.op == "+" and isinstance(right, ...) ...:
...
elif isinstance(...) and ...:
...
...
```

It would be difficult to code. It would be difficult to read. It would be
difficult to maintain. In short: it would be ugly code. Structural pattern
matching gives us a much cleaner way to implement `simplify()`

. Consider:

```
def simplify(expr):
match expr:
case UnaryOp("+", x): return simplify(x)
case BinaryOp("+", x, Number(0)): return simplify(x)
case BinaryOp("+", Number(0), x): return simplify(x)
case BinaryOp("-", x, Number(0)): return simplify(x)
case BinaryOp("-", Number(0), x): return simplify(x)
case BinaryOp("*", x, Number(1)): return simplify(x)
case BinaryOp("*", Number(1), x): return simplify(x)
case BinaryOp("*", x, Number(0)): return Number(0)
case BinaryOp("*", Number(0), x): return Number(0)
case BinaryOp("+", x, UnaryOp("-", y)):
return simplify(BinaryOp("-", x, y))
case BinaryOp("-", x, UnaryOp("-", y)):
return simplify(BinaryOp("+", x, y))
case UnaryOp("-", UnaryOp("-", x)):
return simplify(x)
case UnaryOp("-", UnaryOp("+", x)):
return simplify(UnaryOp("-", x))
case UnaryOp("+", UnaryOp("-", x)):
return simplify(UnaryOp("-", x))
case UnaryOp(op, x):
return UnaryOp(op, simplify(x))
case BinaryOp(op, x, y):
return BinaryOp(op, simplify(x), simplify(y))
case _: return expr
```

The above implementation is relatively concise and covers even more simplification rules than were initially required, but how does it work?

In the simple integer matching example above, we saw that a pattern could
be a number (optionally with a condition). The pattern syntax, however, is
much more expressive. We can use our dataclasses and just **write what we
would like to match**:

The first case `case UnaryOp("+", x):`

matches any expression that is a unary
expression of the `"+"`

operator. The argument is arbitrary. The `x`

in the
pattern is a name capture that makes the operator’s argument available
within the case statement. For the expression \(+x\), we take the
argument `x`

and pass it recursively to `simplify()`

.

The way the patterns are specified closely resembles what we specified in the
initial set of requirements for the function `simplify()`

. Let’s take the
mathematical rule \(e + 0\), for example. This is translated to the pattern
`BinaryOp("+", x, Number(0))`

in Python. There is nothing magical about this
pattern. It is the formulation of \(e + 0\) using our dataclasses.
The name capture `x`

encodes the fact that \(e\) is an arbitrary expression.

We see in this example also that the structural pattern matching is not
limited to a single class and its arguments. The `right`

argument is yet
another dataclass instance, `Number(0)`

, representing the number \(0\). In this case it looks
trivial, but the concept can be applied to more complex patterns.

Consider the implementation for the \(e + (-f)\) rule. We want to match a sum
(i.e. `BinaryOp("+", left, right)`

) where the right operand is the negation of
an expression (i.e. `UnaryOp("-", arg)`

) the corresponding case statement with
it’s pattern is then

```
case BinaryOp("+", x, UnaryOp("-", y)):
... # x and y are available
```

## Wrap up

You’ve seen two simple examples that showcased two specific features of pattern matching in Python:

- Simple patterns with conditions and
- Structural patterns with dataclasses.

Besides these two features, the specification is much more extensive. I recommend having a look at that document.

Earlier, I’ve promised that we will make defining expressions from the second
example more convenient. We could define overloaded operator methods in the
`Expression`

class. Additionally, the function `parse()`

converts
numbers to `Number`

instances and strings to `Variables`

.

```
class Expression:
def __add__(self, other):
return BinaryOp("+", self, parse(other))
def __sub__(self, other):
return BinaryOp("-", self, parse(other))
def __mul__(self, other):
return BinaryOp("*", self, parse(other))
def __neg__(self):
return UnaryOp("-", self)
def parse(expr):
match expr:
case int(a) | float(a): return Number(a)
case str(n): return Variable(n)
case _: return expr
```

The two expressions from above thus become:

```
a = Variable("a")
b = Variable("b")
c = Variable("c")
pythagoras = BinaryOp("=", c*c, a*a + b*b)
# (a + 1) * Δ
score = (a + 1) * "delta"
```

Much nicer, right?

We haven’t actually used the `simplify()`

function. This is
left as an exercise to the reader. What does `simplify((Variable("a") + 1) * 0)`

return?