Prevent accidental HTTP requests to 3rd party services in Python tests

The world is full of web services which interact with each other via HTTP(S) requests and responses. Python is one of the most loved technologies when it comes to fetching data from the web, e.g. requests is downloaded close to 70 million (!) times per month from PyPI. If you have played with Python for more than two dark nights, you have probably written some sort of HTTP client implementation or used some library which makes HTTP requests under the hood. Hopefully you also wrote a set of tests for your masterpiece. If you have written tests for functionality which triggers HTTP requests, you have probably felt the pain of mocking / patching. Sometimes you just don't remember to mock all the things you should, and you end up making accidental HTTP requests while running the tests. In this blog post, I'll present a simple solution to this problem.

The problem

Have you sent real emails or SMS messages in your tests by accident? Or have you accidentally POSTed some foo bar test data to a production environment of a 3rd party web service while running your test suite? I have. I bet I'm not the only one. If you're unsure whether your tests could be doing something similar, try running the test suite without an internet connection and see if something weird occurs.

Python provides all the goodies for mocking basically anything during tests. So, what's the problem? We are human, we make mistakes. Sometimes developers are unaware of all the side effects a code path might have. On the other hand, occasionally developers are just sloppy or lazy. Of course, we should have a test-specific configuration which at least ensures that an accidental call never reaches a real service. However, even if there's a mishap in the test configuration, we should not let accidental HTTP requests towards 3rd party services through.

When writing tests for our Python applications and libraries, we have to be aware of all the potential side effects, such as HTTP requests, which may be triggered in the code execution path under test. This is not always trivial. Especially when writing integration / end-to-end tests, it's hard to keep track of all potential side effects. Even if you're completely familiar with your own code, it's almost impossible to stay up to date with all the potential side effects of the 3rd party libraries and frameworks that your code depends on. This gets harder as the complexity of the codebase increases.

The solution

Let's consider a simple (and rather naive) HTTP client implementation which offers a method for fetching a Wikipedia page:

from urllib.parse import urljoin

import requests


class WikiClient:
    def __init__(self, base_url='https://en.wikipedia.org/wiki/'):
        self._base_url = base_url
        self._session = requests.session()

    def fetch_page(self, page_name) -> requests.Response:
        url = urljoin(self._base_url, page_name)
        return self._session.get(url)

Then let's add a naive (pytest) test case for it:

def test_wiki_client_fetch_page():
    wiki = WikiClient()
    
    response = wiki.fetch_page('Python_(programming_language)')
    
    assert response.status_code == 200
    assert 'Guido van Rossum' in response.text

This test will pass, but it has one fundamental problem: it'll make an HTTP request to Wikipedia. In most cases, we want our tests to run in isolation. For example here, we probably don't want to hammer wikipedia.org in our tests.
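For reference, isolating this single test could look roughly like the sketch below, which patches requests.Session.get with unittest.mock. The fake response text and the test name are assumptions made for illustration; WikiClient is repeated only to keep the snippet self-contained.

```python
from unittest import mock
from urllib.parse import urljoin

import requests


class WikiClient:
    def __init__(self, base_url='https://en.wikipedia.org/wiki/'):
        self._base_url = base_url
        self._session = requests.session()

    def fetch_page(self, page_name) -> requests.Response:
        url = urljoin(self._base_url, page_name)
        return self._session.get(url)


def test_wiki_client_fetch_page_mocked():
    # Stand-in for a real response; only the attributes the test reads are set.
    fake_response = mock.Mock(
        status_code=200, text='Guido van Rossum created Python.'
    )

    # Replace Session.get for the duration of the block, so no request is sent.
    with mock.patch.object(
        requests.Session, 'get', return_value=fake_response
    ) as fake_get:
        wiki = WikiClient()
        response = wiki.fetch_page('Python_(programming_language)')

    fake_get.assert_called_once_with(
        'https://en.wikipedia.org/wiki/Python_(programming_language)'
    )
    assert response.status_code == 200
    assert 'Guido van Rossum' in response.text
```

This only protects the one test that remembers to apply the patch, which is exactly the weak spot discussed next.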

Okay, the problem has been identified and we can easily fix it, for example by mocking requests.session().get or by using the responses library. However, we may have already run the test multiple times before we notice the issue. In this naive example it's harmless, but in some real-world use cases it can be a completely different story. So, how could we make sure that we don't accidentally make those HTTP requests when we forget to mock some crucial part of the functionality under test? We can mock HTTP requests for all the tests:

import pytest


@pytest.fixture(autouse=True)
def no_http_requests(monkeypatch):
    def urlopen_mock(self, method, url, *args, **kwargs):
        raise RuntimeError(
            f"The test was about to {method} {self.scheme}://{self.host}{url}"
        )

    monkeypatch.setattr(
        "urllib3.connectionpool.HTTPConnectionPool.urlopen", urlopen_mock
    )

As you can see, it's an autouse pytest fixture which will raise a RuntimeError when urlopen is called. It can be placed e.g. in the root-level conftest.py file. The exception will contain information about the request that was about to be executed. For example, in the case of test_wiki_client_fetch_page, the test would fail with:

RuntimeError: The test was about to GET https://en.wikipedia.org/wiki/Python_(programming_language)
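To check that the guard really fires, a small test along these lines might help. It is a sketch that assumes pytest 6.2 or later (for pytest.MonkeyPatch) and an installed requests; the test name and the shortened URL are illustrative choices, and the guard function simply repeats the one from the fixture above.

```python
import pytest
import requests


def urlopen_mock(self, method, url, *args, **kwargs):
    # Same guard as in the fixture above: fail before anything is sent.
    raise RuntimeError(
        f"The test was about to {method} {self.scheme}://{self.host}{url}"
    )


def test_guard_blocks_requests():
    # MonkeyPatch.context() undoes the patch when the with block exits.
    with pytest.MonkeyPatch.context() as mp:
        mp.setattr(
            "urllib3.connectionpool.HTTPConnectionPool.urlopen", urlopen_mock
        )
        # HTTPS pools inherit urlopen from HTTPConnectionPool, so they are covered too.
        with pytest.raises(RuntimeError, match="about to GET https://en.wikipedia.org"):
            requests.get("https://en.wikipedia.org/wiki/Python")
```

If this test ever fails with something other than the RuntimeError, the assumption that requests goes through urlopen no longer holds in that environment.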

The fixture relies on the fact that all HTTP requests eventually go through urllib3.connectionpool.HTTPConnectionPool.urlopen. This is of course an assumption which does not apply to 100% of the implementations out there in the wild. However, it's a fair assumption for the majority of use cases. For example, requests relies on urllib3 in its low-level implementation, which means that the fixture is compatible with all client code which uses requests. If your code or your dependencies use some other magic for making HTTP requests, a similar approach can be used for mocking those.

If there's a need for allowing HTTP requests towards certain hosts, the fixture can be extended:

import pytest
from urllib3.connectionpool import HTTPConnectionPool


@pytest.fixture(autouse=True)
def no_http_requests(monkeypatch):
    # Hosts that tests are still allowed to reach
    allowed_hosts = {'localhost'}
    # Keep a reference to the real method so allowed requests can pass through
    original_urlopen = HTTPConnectionPool.urlopen

    def urlopen_mock(self, method, url, *args, **kwargs):
        if self.host in allowed_hosts:
            return original_urlopen(self, method, url, *args, **kwargs)

        raise RuntimeError(
            f"The test was about to {method} {self.scheme}://{self.host}{url}"
        )

    monkeypatch.setattr(
        "urllib3.connectionpool.HTTPConnectionPool.urlopen", urlopen_mock
    )

Further, if we want the fixture to be session-scoped to reduce overhead, we can write a custom session-scoped monkeypatch fixture. (At the time of writing, the one provided by pytest supports only function scope.)

import pytest
# Private import path; pytest 6.2 and later also expose this class as pytest.MonkeyPatch
from _pytest.monkeypatch import MonkeyPatch


@pytest.fixture(scope='session')
def monkeypatch_session():
    m = MonkeyPatch()
    yield m
    m.undo()
    

@pytest.fixture(autouse=True, scope='session')
def no_http_requests(monkeypatch_session):
    ...
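The elided body can presumably reuse the guard shown earlier. A fuller sketch, assuming pytest 6.2 or later (where MonkeyPatch is available as pytest.MonkeyPatch), could look like this; install_http_guard is a hypothetical helper introduced here so that the patching logic can also be exercised outside a fixture.

```python
import pytest
from urllib3.connectionpool import HTTPConnectionPool


def install_http_guard(mp):
    """Patch urlopen through the given MonkeyPatch instance (hypothetical helper)."""

    def urlopen_mock(self, method, url, *args, **kwargs):
        raise RuntimeError(
            f"The test was about to {method} {self.scheme}://{self.host}{url}"
        )

    mp.setattr(HTTPConnectionPool, "urlopen", urlopen_mock)


@pytest.fixture(scope="session")
def monkeypatch_session():
    mp = pytest.MonkeyPatch()
    yield mp
    # Restore the original urlopen once the whole test session is over.
    mp.undo()


@pytest.fixture(autouse=True, scope="session")
def no_http_requests(monkeypatch_session):
    install_http_guard(monkeypatch_session)
```

One trade-off of the session scope is that the patch is applied once for the whole run, so a per-test exception would need its own mechanism.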

One of the benefits of the no_http_requests fixture is that it does the monkeypatching at a low level. This basically means that it does not limit the use of mocking / patching facilities at a higher level. For example, we can still use responses in our tests:

import responses


@responses.activate
def test_wiki_client_fetch_page():
    responses.add(
        responses.GET,
        "https://en.wikipedia.org/wiki/Python_(programming_language)",
        body="Guido van Rossum likes Monty Python quite a bit.",
    )
    wiki = WikiClient()
    
    response = wiki.fetch_page("Python_(programming_language)")
    
    assert response.status_code == 200
    assert "Guido van Rossum" in response.text

Nonetheless, the biggest benefit is that we don't execute HTTP requests by accident! If we have no_http_requests in place, we'll get immediate feedback if our test is going to perform an HTTP request. I argue it's also a productivity boost as developers don't need to double check what they should be mocking before running a test which may trigger HTTP requests.

Pleasant coding and testing!

Jerry Pussinen
