Functional tests, and speeding up the schema creation

Posted by Matthias Noback

I first learned about the functional test when Symfony2 was created. It is an interesting type of test where everything about your application is as real as possible, just like with an integration or end-to-end test. One big difference: the test runner exercises the application's front controller programmatically, instead of through a web server. This means that the input for the test is a Request object, not an actual HTTP message.

Drawbacks of functional tests

This approach has several drawbacks, like:

  1. Faking a Request object means you don't know if the code also works in a production environment, when an actual HTTP request will be provided as input.
  2. Inspecting a Response object instead of a real HTTP response message comes with the same risk, but my guess is that it's a less severe risk.
  3. Since you have access to the application's service container you may be tempted to open the black box and replace all kinds of services, or otherwise influence them. I recommend doing no such thing.
  4. You are likely to end up testing all kinds of domain-related behaviors through thick layers of unrelated code. This obscures the view on the real issues and makes it hard to figure out what's wrong, if anything pops up. Combining this with drawback 3 you often find yourself rummaging around in all the stuff that's on the inside of your application, lengthening the feedback loop between building something and finding out if it works. Because these tests end up being white-box tests after all, you also end up with many cracks; it will be easy for anything to fall through them and you'll only find out if that happened when the code is already running in production.

Another issue I see with functional tests is that the assertions that are made are often something like: "I see in database". This too is like opening the box and looking inside. I'd like to follow this reasoning instead: if, as a user of the system, I expect a request to produce some kind of effect, then I should be able to notice this effect as a user. And as a user I can't look directly inside the database. There must be some other aspect of the system that was changed when I made that request. E.g. after registration I can now log in, or the product I added to my basket actually shows up on the "my shopping basket" page.
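To make the difference concrete, here is a minimal sketch with a toy, in-memory application. `ToyApp`, its `handle()` method, and the two routes are hypothetical stand-ins for a real front controller; the only point is where the assertion looks.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for an application's front controller.
final class ToyApp
{
    /** @var array<string, string> username => password hash (plays the role of the database) */
    private array $users = [];

    /**
     * @param array<string, string> $data
     * @return array{status: int, body: string}
     */
    public function handle(string $method, string $path, array $data = []): array
    {
        if ($method === 'POST' && $path === '/register') {
            $this->users[$data['username']] = password_hash($data['password'], PASSWORD_DEFAULT);

            return ['status' => 201, 'body' => 'Registered'];
        }

        if ($method === 'POST' && $path === '/login') {
            $hash = $this->users[$data['username']] ?? null;
            if ($hash !== null && password_verify($data['password'], $hash)) {
                return ['status' => 200, 'body' => 'Welcome, ' . $data['username']];
            }

            return ['status' => 401, 'body' => 'Invalid credentials'];
        }

        return ['status' => 404, 'body' => 'Not found'];
    }
}

function registeredUserCanLogIn(ToyApp $app): bool
{
    $app->handle('POST', '/register', ['username' => 'alice', 'password' => 's3cret']);

    // Black-box check: we never inspect the stored users directly.
    // We observe the effect of registering through a second request.
    $response = $app->handle('POST', '/login', ['username' => 'alice', 'password' => 's3cret']);

    return $response['status'] === 200;
}
```

The test only ever talks to `handle()`, so the storage behind it could be swapped out without the test noticing.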

In some rare cases the desired effect can't be observed by a user. For instance, when you expect a certain message to be published to a queue. Well, maybe a functional test isn't the right place for such a check after all, and you can prove that the message will be published using a combination of other, smaller tests. But if you still want to do it, you'd have to do a quick peek into the box after all. Sometimes that's just how it is. But always look for ways to prevent this, and let your application remain the black box that you poke at with HTTP requests only, making only assertions about the response you receive.

Having discussed many of the downsides now, let's not forget the good parts: we don't have to deal with a web server or run all those slow browser tests, and we still get a "90% okay" approval from our tests. We can accept the "risk" and work with functional tests instead of true end-to-end tests, which force the test to send an actual HTTP request to an actual web server and wait for an actual HTTP response. We can benefit from some options for peeking-in-the-box. If we can keep the number of times we do this to the absolute minimum, we will end up with high-quality tests that don't fail for just any random reason, like concurrency issues.

Bringing the database in the right state

One thing that turns out to be quite hard when writing functional tests is getting the database in the correct state for running the tests. In fact, we have to get it in the right state before each test, so that tests can't influence each other when they use the same database tables. Ideally I'd always use the actual steps that a user would take to get that data into the database, instead of loading data directly into the database using some kind of fixture generation tool. But for some data, this is just impossible. That data needs to be there from the start. Sometimes it's because of how the application has been designed; or maybe nobody realized there was a problem until they started writing tests (we call it a "legacy application" then ;)). Sometimes it's a hint that this data-that-is-always-there should not be in the database but in the code instead, since the code is by definition always-there. See also About fixtures.

Before running a functional test you have to get the database into the correct state by running database migrations in the application's startup script. There's one requirement for that: starting with an empty schema, running these migrations will result in a database schema that mimics the one used in production. This will not be the case in some projects (like the one I'm working on right now), so you may need to introduce a "base migration" first, and then optionally run some migrations to reflect recent changes to the schema as well.

In my current project our problem was that setting up the schema before every test was a rather slow thing to do. In the beginning it wasn't too bad, but hundreds of functional tests later, we started to notice. For some of us running these tests took 5 minutes. This is a great window of opportunity to lose interest in programming altogether and resort to opening a browser tab with your favorite social media site.

In our functional test suite we switch from MSSQL (which runs in production) to Sqlite, which is itself risky since there are so many differences. But so far we haven't figured out a nice, cheap, and fast way to run MSSQL on our local development machines (if you have a solution, please let me know!). We run Sqlite in "in-memory" mode so it runs well, but since the database gets dropped after every test we have to recreate the schema for every test as well, which is the slow part.
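A minimal sketch of why the schema disappears, assuming the pdo_sqlite extension is available: an in-memory Sqlite database only lives as long as its connection. The `users` table and the helper function are just illustrations, not part of our project.

```php
<?php
declare(strict_types=1);

/**
 * Opens a fresh in-memory Sqlite database and lists the tables it contains.
 *
 * @return string[]
 */
function tablesInFreshMemoryDatabase(): array
{
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    return $pdo
        ->query("SELECT name FROM sqlite_master WHERE type = 'table'")
        ->fetchAll(PDO::FETCH_COLUMN);
}

// First "test": create a table in an in-memory database, then drop the connection
$first = new PDO('sqlite::memory:');
$first->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$first->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$first = null; // the in-memory database is gone together with its connection

// Second "test": a new connection starts with an empty schema again
$tables = tablesInFreshMemoryDatabase();
```

Here `$tables` ends up as an empty array, which is why every test has to pay for the schema creation again.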

Speeding up schema creation

I was thinking about a trick that is used in Liip's FunctionalTestBundle where they run Sqlite with a database file on disk. For the first test they would set up the database, run all the migrations, and create a snapshot of the database file. Before running the next test the snapshot would be copied to the location of the actual database. This saves us from running the migrations before every test. But it does force us to stop using Sqlite in-memory mode and switch to the file mode. Unfortunately, file manipulations are quite slow, in particular if you do them on a bind-mounted Docker volume on Mac or Windows. So this didn't look like a good option, until I remembered another trick that was used back in the day (and maybe still is): storing the database file on the temporary file system, which is an in-memory filesystem. It behaves just like a regular file system, but the manipulations on it will be very fast.
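Stripped of any framework, the snapshot idea could be sketched like this. It assumes the pdo_sqlite extension; `prepareDatabase()`, the `$createSchema` callback, and the `users` table are hypothetical names used for illustration, and the regular temp directory stands in for an in-memory filesystem.

```php
<?php
declare(strict_types=1);

/**
 * Makes sure $dbPath holds an empty database with the full schema.
 * The schema is built only once; after that the image file is copied.
 *
 * @param callable(PDO): void $createSchema hypothetical callback that builds the schema
 */
function prepareDatabase(string $dbPath, string $imagePath, callable $createSchema): PDO
{
    if (!file_exists($imagePath)) {
        // Slow path, taken once: build the schema and keep a copy as the image
        $pdo = new PDO('sqlite:' . $dbPath);
        $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        $createSchema($pdo);
        $pdo = null; // close the connection before copying the file
        copy($dbPath, $imagePath);
    } else {
        // Fast path: restore the pristine image
        copy($imagePath, $dbPath);
    }

    $pdo = new PDO('sqlite:' . $dbPath);
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    return $pdo;
}

$dir = sys_get_temp_dir() . '/sqlite_snapshot_' . uniqid('', true);
mkdir($dir);
$dbPath = $dir . '/db.sqlite';
$imagePath = $dir . '/db_image.sqlite';

$schemaRuns = 0;
$createSchema = function (PDO $pdo) use (&$schemaRuns): void {
    $schemaRuns++;
    $pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
};

// First "test": builds the schema, writes some data, closes the connection
$pdo = prepareDatabase($dbPath, $imagePath, $createSchema);
$pdo->exec("INSERT INTO users (name) VALUES ('alice')");
$pdo = null;

// Second "test": gets the schema back from the image, without the data
$pdo = prepareDatabase($dbPath, $imagePath, $createSchema);
$rowCount = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
```

The schema callback runs only for the first call; the second call starts from an empty `users` table again because the image was copied over the used database file.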

When using Symfony, it's even an option to put your whole cache directory in shared memory, and then configure Doctrine DBAL (PDO) to store the database inside that cache directory.

// In AppKernel.php

final class AppKernel extends Kernel
{
    // ...

    public function getCacheDir(): string
    {
        if ($this->getEnvironment() === 'test') {
            return '/dev/shm/' . $this->getEnvironment();
        }

        return parent::getCacheDir();
    }
}

When using Docker, you should probably increase the allowed size of this in-memory filesystem:

services:
    php:
        # ...
        shm_size: 2GB

In Symfony's test configuration you should then configure the Sqlite database as follows:

parameters:
    sqlite_db_path: "%kernel.cache_dir%/db.sqlite"
    sqlite_db_image_path: "%kernel.cache_dir%/db_image.sqlite"

doctrine:
    # ...
    dbal:
        driver: pdo_sqlite
        memory: false
        path: "%sqlite_db_path%"

Finally, you have to hook into your test runner to create the database image/snapshot if it doesn't exist yet, and to copy it to the expected location before every test. When you use Codeception with the Symfony module, this would work:

namespace Codeception\Module;

use Codeception\Module;
use Codeception\Module\Symfony;
use Codeception\TestInterface;
use Doctrine\DBAL\Connection;
use Symfony\Component\DependencyInjection\ContainerInterface;

final class DatabaseFixtures extends Module
{
    public function _before(TestInterface $test): void
    {
        // Before every test we set up the database

        self::setUpDatabase($this->getSymfonyModule()->_getContainer());
    }

    public function _after(TestInterface $test): void
    {
        self::tearDownDatabase($this->getSymfonyModule()->_getContainer());
    }

    private function getSymfonyModule(): Symfony
    {
        return $this->getModule('Symfony');
    }

    private static function setUpDatabase(ContainerInterface $container): void
    {
        $sqliteDbPath = $container->getParameter('sqlite_db_path');
        $sqliteDbImagePath = $container->getParameter('sqlite_db_image_path');

        if (!file_exists($sqliteDbImagePath)) {
            /*
             * A database "image" file does not exist yet. We will use
             * the regular connection to set up an empty database with a
             * complete schema first. Then we'll copy the resulting database
             * file and keep it as an image which we can use during subsequent
             * tests.
             */

            /** @var Connection $sqliteConnection */
            $sqliteConnection = $container->get('doctrine.dbal.default_connection');

            // Set up the schema manually:
            $sqliteConnection->query(/* ... */);
            // (Or better yet: run all migrations)

            // Then copy the resulting DB file to a different location
            copy($sqliteDbPath, $sqliteDbImagePath);
        } else {
            /*
             * A database image file already exists containing no data but including
             * the full schema. We copy it to the place where our database connection
             * expects to find it.
             */
            copy($sqliteDbImagePath, $sqliteDbPath);
        }
    }

    private static function tearDownDatabase(ContainerInterface $container): void
    {
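        $sqliteDbPath = $container->getParameter('sqlite_db_path');

        // Guard against a missing file, e.g. when the set-up failed before creating it
        if (file_exists($sqliteDbPath)) {
            unlink($sqliteDbPath);
        }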
    }
}

You'd also have to enable this Codeception "module", for instance in functional.suite.yml:

class_name: FunctionalTester
modules:
    enabled:
        # ...
        - DatabaseFixtures

I hope any of this will be useful to you in some way.

P.S. Currently, the only problem I've discovered is that when you put the entire cache directory in /dev/shm, the Symfony container has to be rebuilt for every test run. This usually happens very quickly, and writing to disk will be very fast too, but not if you also enable Xdebug. So if you want to step-debug a test, it may take many seconds before you can do that. One solution would be to store only the database in /dev/shm but not the entire Symfony cache.
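A rough sketch of that last idea: a small helper that picks a directory for the database files only, and leaves the cache directory alone. The function name `testDatabaseDir()` and the fallback to the regular temp directory are my own assumptions, not something the project described here uses.

```php
<?php
declare(strict_types=1);

/**
 * Picks a directory for the test database files only.
 * Prefers the in-memory filesystem when it is available and
 * falls back to the regular temporary directory otherwise.
 */
function testDatabaseDir(string $environment, string $shmDir = '/dev/shm'): string
{
    $baseDir = (is_dir($shmDir) && is_writable($shmDir)) ? $shmDir : sys_get_temp_dir();
    $dir = rtrim($baseDir, '/') . '/' . $environment;

    // Create the directory if needed; the second is_dir() covers a concurrent mkdir()
    if (!is_dir($dir) && !mkdir($dir, 0777, true) && !is_dir($dir)) {
        throw new RuntimeException(sprintf('Could not create directory "%s"', $dir));
    }

    return $dir;
}

// Hypothetical usage: these paths would replace the ones based on %kernel.cache_dir%
$dbDir = testDatabaseDir('test');
$sqliteDbPath = $dbDir . '/db.sqlite';
$sqliteDbImagePath = $dbDir . '/db_image.sqlite';
```

How these paths end up in the sqlite_db_path and sqlite_db_image_path parameters (an environment variable would be one option) is left open here.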

Comments
Oleksandr Dombrovskyi
Why can't you just prepare the database and then, before each test, start a transaction and roll it back after the test? That way your tests are many times faster and you don't need to worry about having a clean database.
Filippo Tessarotto

Hi, we work with MySQL 5.7 and we managed to get impressive performance boosts without tricky configuration (tricky means hard to maintain across platforms and time) with the following steps:

  1. tmpfs: /var/lib/mysql for MySQL in the docker-compose.yml; no copy-paste capabilities like /dev/shm, but still all MySQL in RAM
  2. Raw SQL dump file as database populator piped to a raw MySQL binary (native or https://github.com/Slamdunk/mysql-php); this may seem foolish, but we've found that the overhead of any PHP abstraction layer over database population is huge. /dev/shm copy-paste is better, but cat dump.sql|mysql is easier to port and maintain
  3. Codeception+Robo parallel executions: https://codeception.com/docs/12-ParallelExecution very easy and straightforward to get in place both locally and in CI

We got one project from 15 minutes to 3, and another one from 8 minutes to 1.

Filippo

PS: it would be nice to have your tweet https://twitter.com/matthiasnoback/status/1242118308679299086 pinpointed here so everyone can read the interesting replies to it :)

Piero Recchia
Hi Matthias, we created a PHPUnit listener implementing BeforeFirstTestHook. In that class we run the Doctrine migrations, but only after checking whether there is a new migration; if there isn't, we skip the migration run. We also use DAMA\DoctrineTestBundle\PHPUnit\PHPUnitExtension, which runs all database operations in a transaction and rolls it back after every test, so the initial state remains intact. And we use factory-muffin to create on the fly the objects or data needed for every test. I hope it can help you; if you need more detail, let me know.