To eager or not to eager - how to properly test your Celery tasks
Eager mode in testing
Some time ago I ran into a problem. In the project I was working on, task_always_eager was used in the test suite, but I needed to store task results in the backend, and that was not possible before Celery 5.1.0.
The culprit is this condition in Celery, which decides whether a result gets published:

```python
publish_result = not eager and not ignore_result
```
The reason Celery does not store eager task results in the backend is quite trivial. Eager tasks are executed locally, without being sent to the queue, so it makes sense that the result is not stored in the result backend. The task is executed directly by the caller, so in a normal situation it is indeed not desirable to keep the result in the backend. But what about testing?
The Celery docs mention:

> The eager mode enabled by the task_always_eager setting is by definition not suitable for unit tests.
>
> When testing with eager mode you are only testing an emulation of what happens in a worker, and there are many discrepancies between the emulation and what happens in reality.
Although I agree with the premise, I don't think we should never use eager mode when testing. Of course, that way we don't test the actual Celery worker; we skip a huge part of the worker setup, which is never desired. On the other hand, eager tasks allow us to test the body of a task, and in reality that's often all we need: we need to test the business logic, and if the infrastructure is already set up and working fine, chances are everything will work as expected.
My solution
As I mentioned, I needed task results to be available in the backend in eager mode. The reason is that I treated the results as a sort of cache.
As I was using Django, I added django-celery-results (which is also maintained by the Celery team) to my project.
In ticket #49 from 2018, several people complained about not being able to save the result to the backend with eager mode. Even though some hacks were suggested, I didn't find the solution I needed there.
Then I found two Stack Overflow questions, or actually answers, that opened my eyes. Sovetnikov mentions that it's not possible to achieve this with eager mode; the Celery worker has to run as it does in production, which means starting a Celery worker in a subprocess. Before I started thinking about the implementation, I found drhagen's answer with a working example in Django (thank you! 🙇♂️). I adjusted his example to my needs so that the worker starts once in the test runner rather than in each test case.
```python
import logging

from celery.contrib.testing.worker import start_worker
from django.test.runner import DiscoverRunner

logger = logging.getLogger(__name__)


class CeleryTestRunner(DiscoverRunner):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.worker = None

    def setup_celery(self):
        # import the `app` Celery instance from your project's celery module
        from yourapp.celery import app

        self.worker = start_worker(
            app=app,
            perform_ping_check=False,
            concurrency=1,
            pool="solo",
            loglevel=logging.INFO,
        )
        self.worker.__enter__()

    def teardown_celery(self):
        logger.info("Stopping celery")
        self.worker.__exit__(None, None, None)

    def setup_databases(self, **kwargs):
        o = super().setup_databases(**kwargs)
        self.setup_celery()
        return o

    def teardown_databases(self, old_config, **kwargs):
        super().teardown_databases(old_config, **kwargs)
        self.teardown_celery()
```
and in my settings I set it as the default test runner:

```python
TEST_RUNNER = "yourapp.test_runner.CeleryTestRunner"
```
It was this simple, fire and forget! Or at least that’s what I thought…
As the original author says on SO:

> Because Django-Celery involves some cross-thread communication, only test cases that don't run in isolated transactions will work. The test case must inherit directly from SimpleTestCase or its REST framework equivalent APISimpleTestCase and set the class attribute allow_database_queries to True.
The tradeoff is huge. For those unaware, the *Simple* test cases work very differently from regular test cases. Transactions are handled in a way that WILL RESULT in a state of the DB you might not expect: transactions are not rolled back, and the DB is left dirty. After some experimenting, I ran into test isolation problems, and that's something I never ever want to deal with. If there's a possibility that one test might affect another, the tests are basically worthless, because you can't be sure you are testing the actual scenario.
My second (final) solution
I came to the conclusion that task_always_eager is the only sane solution that fits my needs. If it can't save results to the backend, I have to make it save the results.
I opened a ticket in Celery and wrote a PR. It was reviewed, accepted, and merged. That's basically it. 🎉
As of 5.1.0, the warning related to testing with Celery has changed slightly. Now it says:

> The eager mode enabled by the task_always_eager setting is by definition not suitable for unit tests.
>
> When testing with eager mode you are only testing an emulation of what happens in a worker, and there are many discrepancies between the emulation and what happens in reality.
>
> Note that eagerly executed tasks don't write results to backend by default. If you want to enable this functionality, have a look at task_store_eager_result.
And when you click on it, you will see:

> If this is True and task_always_eager is True and task_ignore_result is False, the results of eagerly executed tasks will be saved to the backend.
and all I need in my Django test settings is:

```python
CELERY_TASK_ALWAYS_EAGER = True
CELERY_TASK_STORE_EAGER_RESULT = True
```