Sneak peek at Python 3.8

Python 3.8 is scheduled for release in October this year (2019). The first beta release - no new features after this one - came out a while ago. I decided to have a look at what's coming and played around with some of the new features that seemed interesting.

➜ ~ python --version
Python 3.8.0b1

Assignment expressions

Aka "The Walrus Operator" (:=). PEP 572 (assignment expressions) will get a dedicated chapter in the Python history books: Guido van Rossum, the father of Python, stepped down (or took a "permanent vacation") from his BDFL throne after heated conversations around it. Some could argue that PEP 572 has been one of the most controversial PEPs in the history of the language. His departure led to the current governance model, in which decision-making power over PEPs rests with a steering council of five people.

Let's see what all the fuss is about. The basic idea is that the value of an expression can be bound to a name in the middle of a larger expression. Here are some examples:

while val := some_func():
    ...  # do something with val

if val := some_func():
    ...  # do something with val

# Works even inside a list display
multiples = [val := some_func(), val * 2, val * 3]
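In a comprehension, a typical use is sharing one computed value between the filter clause and the output expression. Here's a minimal sketch of that; the normalize helper and the sample data are made up for illustration.

```python
def normalize(text):
    """Hypothetical helper: strip whitespace, return None for empty input."""
    stripped = text.strip()
    return stripped or None


raw = ["  spam ", "", "eggs", "   "]

# The walrus binds the result of normalize() once per item, so the
# filter clause and the output expression share the same value.
cleaned = [value.upper() for item in raw if (value := normalize(item))]

print(cleaned)  # ['SPAM', 'EGGS']
```

Without the walrus, normalize would have to be called twice per item, or the comprehension unrolled into a regular loop.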

This essentially lets you write code in a more concise manner. Say you have something like this prior to 3.8:

val = func1()
if not val:
    logging.warning('func1 no good')
    val = func2()
    if not val:
        logging.warning('func2 no good')
        val = func3()
if not val:
    raise ValueError('All went south')
    
# Do something with val after

After upgrading to 3.8, you could refactor it to:

if not (val := func1()):
    logging.warning('func1 no good')
    if not (val := func2()):
        logging.warning('func2 no good')
        if not (val := func3()):
            raise ValueError('All went south')

# Do something with val after

Let's put aside the nesting and other ugliness of this beast and just focus on the line count: it's 9 lines (prior to 3.8) vs 6 (3.8+). Patterns like this are not rare. One example mentioned in PEP 572 is matching against regular expressions: try one and proceed to the next if it does not match.
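To make the regular expression case concrete, here's a small sketch of "try one pattern, then the next". The two date patterns and the parse_date function are my own illustration, not something taken from the PEP.

```python
import re

# Hypothetical patterns, tried in order of preference.
ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
US_DATE = re.compile(r"(\d{2})/(\d{2})/(\d{4})")


def parse_date(text):
    """Return (year, month, day) as ints, or None if nothing matches."""
    if m := ISO_DATE.fullmatch(text):
        year, month, day = m.groups()
    elif m := US_DATE.fullmatch(text):
        month, day, year = m.groups()
    else:
        return None
    return int(year), int(month), int(day)


print(parse_date("2019-10-14"))    # (2019, 10, 14)
print(parse_date("10/14/2019"))    # (2019, 10, 14)
print(parse_date("next Tuesday"))  # None
```

Prior to 3.8 each match would need its own assignment line before the test, which pushes the second attempt into a nested else block.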

Overall, I have somewhat mixed feelings about this. Surely the syntax is a bit confusing at the beginning, but I guess it'll feel natural after some tinkering. My personal concern is more on the side of newcomers: every new piece of syntactic sugar moves Python further away from "the language which you can read like English". Of course some of the neat (confusing) tricks can be hidden from newcomers at the beginning of their learning journey, but I still feel that assignment expressions might reduce the overall readability of the language. On the other hand, one can probably refactor their 10-liner script into an 8-liner (accurate average pulled out of a hat) after 3.8 is out, which is absolutely a positive aspect of the feature. Python will be even more compact.

Positional-only arguments

The basic idea is that one can use / in a function definition to indicate that the parameters before it are positional-only. If the caller passes one of them as a keyword argument, the caller will be greeted with a TypeError.

Here's a minimal example:

def func(a, /):
    pass

func('foobar')
func(a='foobar')  # This raises TypeError

I'd say this is a kind of extension to the already existing, yet not widely known or used, keyword-only arguments. It's a bit surprising that positional-only arguments were not introduced at the same time as keyword-only ones. Now that positional-only arguments are supported too, function and method signatures are starting to be "feature complete". One advantage of these *-only arguments is that they give library authors a gentle way to enforce the intended calling convention of their API. When used wisely, *-only arguments improve readability both in the API implementation and in the calling code. Another advantage, specific to positional-only arguments, is that library authors can freely rename those parameters without breaking backwards compatibility.
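The renaming point is easy to demonstrate. In this sketch, scale_v1 and scale_v2 are hypothetical names standing in for two releases of the same library function.

```python
def scale_v1(value, /, factor=2):
    """First release: the first parameter is called `value`."""
    return value * factor


def scale_v2(number, /, factor=2):
    """Later release: parameter renamed; positional callers are unaffected."""
    return number * factor


# Callers could never write scale_v1(value=21), so the rename is safe.
assert scale_v1(21) == scale_v2(21) == 42
assert scale_v2(21, factor=3) == 63

try:
    scale_v2(number=21)  # the new name is not part of the API either
except TypeError as exc:
    print(f"rejected: {exc}")
```

Note that factor is still a regular parameter, so renaming it would break callers who pass it by keyword.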

Here's an example function definition which includes all argument types:

def func(a, /, b, *, c):
    pass

In which,
a: positional-only
b: positional or keyword
c: keyword-only

Thus, func can be called like this:

func('foo', 'bar', c='foobar')

func('foo', b='bar', c='foobar')

But not like this:

func(a='foo', b='bar', c='foobar') # TypeError: func() got some positional-only arguments passed as keyword arguments: 'a'

func('foo', 'bar', 'foobar') # TypeError: func() takes 2 positional arguments but 3 were given

I would assume that positional-only arguments will be most valuable in libraries that handle some sort of mathematical operations. If a function definition uses parameter names such as a, b, or similar, that's likely a sign of a potential use case for positional-only arguments.
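As an illustration of that kind of function, here's a sketch of a made-up distance helper where the names a and b carry no meaning for the caller.

```python
import math


def distance(a, b, /):
    """Euclidean distance between two 2D points given as (x, y) tuples."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


print(distance((0, 0), (3, 4)))  # 5.0
```

Writing distance(a=(0, 0), b=(3, 4)) would not make the call any clearer, so nothing is lost by forbidding it.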

Runtime audit hooks

This one is clearly aimed at security use cases. On the other hand, I also believe it'll bring a set of use cases that were not necessarily foreseen by the core development team. People tend to be creative when they get new toys to play with.

Here's a simple example:

import sys

import requests


def audit(event, args):
    if 'socket' in event:
        if event == 'socket.connect':
            raise RuntimeError('Alright, this is too much')
        print(f'Captain, some networking going on: event={event}, args={args}')
    else:
        print(f'Not very interesting event: {event}')


def do_something_nasty():
    return requests.get('https://en.wikipedia.org/wiki/Python_(programming_language)')


sys.addaudithook(audit)
do_something_nasty()

Let's run it:

Not very interesting event: import
Not very interesting event: open
Not very interesting event: exec
Not very interesting event: import
Not very interesting event: open
Not very interesting event: exec
Captain, some networking going on: event=socket.gethostbyname, args=('en.wikipedia.org',)
Captain, some networking going on: event=socket.getaddrinfo, args=('en.wikipedia.org', 443, 0, 1, 0)
Captain, some networking going on: event=socket.__new__, args=(<socket.socket fd=-1, family=AddressFamily.AF_UNSPEC, type=0, proto=0>, 30, 1, 6)
Traceback (most recent call last):
    .
    .
    .
    raise RuntimeError('Alright, this is too much')
RuntimeError: Alright, this is too much
Not very interesting event: cpython._PySys_ClearAuditHooks

At the time of writing, the built-in events triggered by the CPython runtime are not yet documented, although a placeholder for the documentation already exists. PEP 578 mentions the obvious downside of using audit hooks: "Once a developer has added audit hooks they have explicitly chosen to trade performance for functionality."

It's also possible to raise custom events:

import sys


def audit(event, args):
    if event == 'my-event':
        print(f'audit: {event} with args={args}')


sys.addaudithook(audit)

sys.audit('my-event', 'foo', 'bar', 123)

# Output:
# audit: my-event with args=('foo', 'bar', 123)

The intended use cases clearly gravitate towards security. Once 3.8 is out, I expect to see a number of new security-related tools and libraries that make heavy use of the new built-in audit hooks. Perhaps some existing libraries will also add an opt-in feature for auditing events triggered by the libraries themselves.
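Here's a rough sketch of what the library side could look like. Everything named here (the mylib.load_config event, the load_config function, the recorder hook) is hypothetical and only meant to show the shape of it.

```python
import sys

# Hypothetical event name; the dotted "package.action" style is just
# the convention chosen for this sketch.
EVENT = "mylib.load_config"


def load_config(path):
    """Hypothetical library function that announces itself before acting."""
    sys.audit(EVENT, path)
    return {"path": path}


seen = []


def recorder(event, args):
    # Hooks receive every event, so filter down to the ones of interest.
    if event == EVENT:
        seen.append(args)


# Note: there is no API for removing a hook once it has been added.
sys.addaudithook(recorder)
load_config("settings.toml")
print(seen)  # [('settings.toml',)]
```

The library pays almost nothing when no hook is installed, and the application decides whether to listen at all.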

Wrap-up

My two cents: positional-only arguments and runtime audit hooks are likely something that an average Python user will not use (nor see being used), while assignment expressions are going to have a more notable impact. I feel that at least the data science part of the Python community will be pleased with assignment expressions, as similar functionality is present in R as well.