To eager or not to eager - how to properly test your Celery tasks
Eager mode in testing
Some time ago I ran into a problem. In the project I was working on, task_always_eager was used in the test suite, but I needed to store task results in the backend, and that was not possible before Celery 5.1.0.
The culprit is this condition in Celery, which decides whether a result gets published:

```python
publish_result = not eager and not ignore_result
```
The reason Celery does not store eager task results in the backend is quite trivial. Eager tasks are executed locally, without being sent to the queue, so it makes sense that the result is not stored in the result backend. The task is executed directly by the caller, so in a normal situation it is indeed not desirable to keep the result in the backend. But what about testing?
The Celery docs mention:

> The eager mode enabled by the task_always_eager setting is by definition not suitable for unit tests.
>
> When testing with eager mode you are only testing an emulation of what happens in a worker, and there are many discrepancies between the emulation and what happens in reality.
Although I agree with the premise, I don't think we should never use eager mode when testing. Of course, that way we don't test the actual Celery worker; we skip a huge part of the worker setup, which is never desired. On the other hand, eager tasks allow us to test the body of a task, and in reality that's often all we need: we need to test the business logic, and if the infrastructure is already set up and working fine, chances are everything will work as expected.
My solution
As I mentioned, I needed task results to be available in the backend in eager mode. The reason is that I treated the results as a sort of cache.
As I was using Django, I added django-celery-results (which is also maintained by the Celery team) to my project.
In ticket #49 from 2018, several people complained about not being able to save the result to the backend with eager mode. Even though some hacks were suggested, I didn't find the solution I needed there.
Then I found two Stack Overflow questions, or actually answers, that opened my eyes. Sovetnikov mentions that it's not possible to achieve this with eager mode; the Celery worker has to run as it does in production, which means starting a Celery worker in a subprocess. Before I started thinking about the implementation, I found drhagen's answer with a working example in Django (thank you! 🙇♂️). I adjusted his example to my needs so that the worker starts once in the test runner rather than in each test case.
```python
import logging

from celery.contrib.testing.worker import start_worker
from django.test.runner import DiscoverRunner

logger = logging.getLogger(__name__)


class CeleryTestRunner(DiscoverRunner):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.worker = None

    def setup_celery(self):
        # import the `app` Celery instance from your project's celery module
        from yourapp.celery import app

        self.worker = start_worker(
            app=app,
            perform_ping_check=False,
            concurrency=1,
            pool="solo",
            loglevel=logging.INFO,
        )
        self.worker.__enter__()

    def teardown_celery(self):
        logger.info("Stopping celery")
        self.worker.__exit__(None, None, None)

    def setup_databases(self, **kwargs):
        o = super().setup_databases(**kwargs)
        self.setup_celery()
        return o

    def teardown_databases(self, old_config, **kwargs):
        super().teardown_databases(old_config, **kwargs)
        self.teardown_celery()
```
and in my settings I set it as the default test runner:

```python
TEST_RUNNER = "yourapp.test_runner.CeleryTestRunner"
```
It was this simple, fire and forget! Or at least that’s what I thought…
As the original author says on SO:

> Because Django-Celery involves some cross-thread communication, only test cases that don't run in isolated transactions will work. The test case must inherit directly from SimpleTestCase or its REST framework equivalent APISimpleTestCase and set the class attribute allow_database_queries to True.
The tradeoff is huge. For those unaware, the *Simple* test cases work very differently from regular test cases. Transactions are handled in a way that WILL RESULT in a state of the DB you might not expect: transactions are not rolled back, and the DB is left dirty. After some experimenting, I ran into test isolation problems, and that's something I never ever want to deal with. If there's a possibility that one test might affect another, the tests are basically worthless, because you can't be sure you are testing the actual scenario.
My second (final) solution
I came to the conclusion that task_always_eager is the only sane solution that fits my needs. If it can't save results to the backend, I have to make it save the results.
I opened a ticket in Celery and wrote a PR. It was reviewed, accepted, and merged. That's basically it. 🎉
As of 5.1.0, the warning related to testing with Celery has changed slightly. Now it says:

> The eager mode enabled by the task_always_eager setting is by definition not suitable for unit tests.
>
> When testing with eager mode you are only testing an emulation of what happens in a worker, and there are many discrepancies between the emulation and what happens in reality.
>
> Note that eagerly executed tasks don't write results to backend by default. If you want to enable this functionality, have a look at task_store_eager_result.
And when you click on it, you will see:

> If this is True and task_always_eager is True and task_ignore_result is False, the results of eagerly executed tasks will be saved to the backend.
and all I need in my Django test settings is:

```python
CELERY_TASK_ALWAYS_EAGER = True
CELERY_TASK_STORE_EAGER_RESULT = True
```