Caching dependency-injected objects

This blog post talks about caching and retrieving objects in PHP (e.g. via Memcache) where the objects themselves have a number of injected dependencies. It covers using the PHP magic methods __sleep and __wakeup to manage serialisation, and discusses mechanisms for re-injecting dependencies on wakeup in a way that maintains Inversion of Control (IoC).

Sample system

To illustrate the idea, I’ll use a simple domain model where we have a userList object (iterator) containing a number of user objects. Each user has an injected userDao dependency which is used for lazy-loading usageHistory, on request.

class userList implements Iterator, Countable
{
    public function current() { }

    public function key() { }

    public function next() { }

    public function rewind() { }

    public function valid() { }

    public function count() { }
}

class user
{
    private $dao;
    private $usageHistory;

    public function __construct($dao, $userDataRow)
    {
        $this->dao = $dao;
        $this->usageHistory = NULL;
    }

    public function getUsageHistory()
    {
        if ($this->usageHistory === NULL)
        {
            $this->usageHistory = $this->dao->lazyLoadHistory($this);
        }
        return $this->usageHistory;
    }
}

class userDao
{
    public function __construct($database, $cache, $logger) { }

    public function getList() { }

    // Receives the user whose history should be loaded.
    public function lazyLoadHistory($user) { }
}

class usageHistory
{
}

Dependency Injection

A sample invocation of this simple system might be to ask the DAO for a user list object. To create a DAO object we will almost certainly need to pass in a bunch of dependencies such as database services, caching services and logging services.

I’m using a DI container to create objects. To get a really quick idea of what these are about you can imagine doing this:

$diContainer = new diContainer();
$userDao = $diContainer->getInstance('userDao');

Instead of this:

$configuration = new systemConfig();

$database = new mysqlDatabaseConnection($configuration);
$cache = new memcacheConnection($configuration);
$logger = new firebugLogger();

$userDao = new userDao($database, $cache, $logger);
$userList = $userDao->getList();

The key idea is that the DI container will build the object graph for you. For each dependency needed it will go away and fetch that, building any other dependencies of those objects and so on recursively up the tree.
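As a rough illustration of that recursive build, here is a minimal sketch of a container. It is not the annotation-driven container this post is about: it assumes PHP 7.1+ and that every constructor parameter carries a class type hint, which it reads via reflection. The service classes are empty stand-ins with hypothetical public properties.

```php
<?php
// Hypothetical stand-ins for the services named in the post.
class systemConfig {}

class mysqlDatabaseConnection
{
    public $config;
    public function __construct(systemConfig $config) { $this->config = $config; }
}

class memcacheConnection
{
    public $config;
    public function __construct(systemConfig $config) { $this->config = $config; }
}

class firebugLogger {}

class userDao
{
    public $database;
    public $cache;
    public $logger;

    public function __construct(
        mysqlDatabaseConnection $database,
        memcacheConnection $cache,
        firebugLogger $logger
    ) {
        $this->database = $database;
        $this->cache = $cache;
        $this->logger = $logger;
    }
}

// Sketch of a container that resolves constructor type hints via reflection.
class diContainer
{
    private $instances = array();

    public function getInstance($className)
    {
        // Reuse an already-built service so the whole graph shares one instance.
        if (isset($this->instances[$className])) {
            return $this->instances[$className];
        }

        $reflection = new ReflectionClass($className);
        $constructor = $reflection->getConstructor();
        $arguments = array();

        if ($constructor !== null) {
            foreach ($constructor->getParameters() as $parameter) {
                // Assumption: every parameter has a class type hint.
                $arguments[] = $this->getInstance($parameter->getType()->getName());
            }
        }

        $this->instances[$className] = $reflection->newInstanceArgs($arguments);
        return $this->instances[$className];
    }
}

$diContainer = new diContainer();
$userDao = $diContainer->getInstance('userDao');
```

Asking for userDao pulls in the database, cache and logger services, and the database and cache connections end up sharing the same systemConfig instance.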

I’m using an annotation system to power my own DI container, which keeps the whole process simple and configuration-light.

Putting objects to sleep

Caching is a very handy tool to improve the performance of applications. Storing objects in a cache (for example Memcache) saves us from having to go to the database each time. Memcache is a very simple system: a key-value store. You give it some data (less than 1MB) and it stores it for you until you ask for it again. Storing objects is slightly more complex than storing simple strings, because objects need to be serialised. Memcache actually does this for you (you don’t need to call serialize() first).

However caching objects can be problematic. Whenever you start to really use the power of OOP you inevitably end up with complex object graphs. Our user object, for example, contains a userDao object. This in turn contains a database service object, a cache service object and a logging service object. Some of these objects have their own dependencies! For example the database service object contains a configuration object.

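The key point here is that by default, when we serialise a user object we will be serialising all the internal properties, including all the dependencies. This is undesirable: it bloats the cached payload, and things like live database connections cannot be meaningfully serialised anyway.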

This is where PHP’s built-in magic __sleep method comes to the rescue. Using __sleep we can tell PHP what we do want to store. Let’s assume our user object has the following properties:

class user
{
    private $dao;
    private $name;
    private $age;
    private $email;
    private $phoneNumber;
    private $usageHistory;
}

What we’ll do is tell PHP what we want to save.

class user
{
    public function __sleep()
    {
        return array('name', 'age', 'email', 'phoneNumber');
    }
}

Now we can serialise and/or cache objects without the overhead of complex dependency graphs.
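To make the difference concrete, here is a small sketch comparing the serialised form of an object with and without __sleep. The fakeDao class and the two user variants are hypothetical names invented for this comparison only.

```php
<?php
// Hypothetical stand-in for the DAO and everything hanging off it.
class fakeDao
{
    public $connectionSettings = array('host' => 'localhost', 'port' => 3306);
}

class userWithoutSleep
{
    private $dao;
    private $name;

    public function __construct($dao, $name)
    {
        $this->dao = $dao;
        $this->name = $name;
    }
}

class userWithSleep
{
    private $dao;
    private $name;

    public function __construct($dao, $name)
    {
        $this->dao = $dao;
        $this->name = $name;
    }

    public function __sleep()
    {
        // Only the listed properties end up in the serialised string.
        return array('name');
    }
}

$full = serialize(new userWithoutSleep(new fakeDao(), 'Alice'));
$slim = serialize(new userWithSleep(new fakeDao(), 'Alice'));

// The DAO class name appears only in the payload produced without __sleep.
var_dump(strpos($full, 'fakeDao') !== false); // bool(true)
var_dump(strpos($slim, 'fakeDao') !== false); // bool(false)
var_dump(strlen($slim) < strlen($full));      // bool(true)
```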

Waking objects up

When it comes to restoring objects, for example via Memcache::get or via unserialize(), we will end up with a user object that has a valid name, age, email and phoneNumber property. What we won’t have is the DAO dependency or the usageHistory property. It is important to realise that the class constructor will not be called when the object is unserialised.
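The following sketch shows that behaviour with a plain serialize()/unserialize() round trip standing in for the cache. The static call counter and the hasDao() and getName() helpers are hypothetical additions so the result can be inspected.

```php
<?php
class user
{
    // Hypothetical counter, only here to show how often the constructor runs.
    public static $constructorCalls = 0;

    private $dao;
    private $name;

    public function __construct($dao, $name)
    {
        self::$constructorCalls++;
        $this->dao = $dao;
        $this->name = $name;
    }

    public function __sleep()
    {
        // The DAO is deliberately left out of the serialised form.
        return array('name');
    }

    public function hasDao()
    {
        return $this->dao !== null;
    }

    public function getName()
    {
        return $this->name;
    }
}

$original = new user(new stdClass(), 'Alice');
$restored = unserialize(serialize($original));

var_dump(user::$constructorCalls); // int(1): unserialize() did not call the constructor
var_dump($restored->getName());    // string(5) "Alice"
var_dump($restored->hasDao());     // bool(false): the dependency is gone
```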

For pure simplicity we can use PHP’s built-in magic __wakeup method to execute code on unserialisation.

class user
{
    public function __wakeup()
    {
        $this->usageHistory = NULL;
        $diContainer = new diContainer();
        // Assign to the property; a local variable would be discarded.
        $this->dao = $diContainer->getInstance('userDao');
    }
}

This is handy for ensuring that the usageHistory property is properly set to NULL (so it will lazy-load). The problem with this approach is that we lose the Inversion of Control. Instead of having the dependencies injected, we are looking them up ourselves, which gives us a tightly coupled dependency on the DI container. One of the key points of DI is that the objects themselves shouldn’t really know or care about the DI container.

When constructing objects using the DI container we never directly use the “new” keyword to create objects; instead we rely on the DI container to do this for us, and it supplies all dependencies as parameters. However, PHP itself calls __wakeup during unserialisation and passes it no arguments, so we can’t inject dependencies there.

Restoring dependencies

To ensure that dependencies are restored correctly I use a “magic” method __restoreDependencies. Ok so it’s not actually that magic; PHP doesn’t call it automatically! However the serialisation/unserialisation in my application is localised within my cache object. Therefore what I can do is adjust my cache::get method:

class cache
{
    public function get($key)
    {
        $value = $this->memcache->get($key);
        if (is_object($value) && $value instanceof cacheable)
        {
            $this->diContainer->wakeup($value);
        }

        // Hand the (possibly re-wired) value back to the caller.
        return $value;
    }
}

To make life easy I actually use a “cacheable” interface that objects must implement in order to be stored in cache. This formality really just ensures that no one tries to cache objects without making sure they think of the implications on dependencies. The cacheable interface simply ensures that an object has a __restoreDependencies() method.

The (bespoke) DI container has a “wakeup” method that will:

1. Call the __restoreDependencies() method injecting any required services (dependency objects)

2. If the __restoreDependencies() method returns an array of other objects, call the wakeup() method on those objects as well. This can repeat recursively if required.

The second point here ensures that we can cache an entire userList object and wake it up effectively. The userList object’s __restoreDependencies() method would return an array of all user objects that need waking up.
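The two steps above can be sketched roughly as follows. This is an illustration under assumptions rather than the actual container: it assumes PHP 7.1+, looks up services by the type hints on __restoreDependencies() (not by annotations), and uses a hypothetical register() method to supply them. The parameters are nullable with defaults so that implementations stay compatible with the parameterless interface method.

```php
<?php
// Hypothetical service standing in for the real userDao.
class userDao {}

interface cacheable
{
    // Implementations may add optional, type-hinted parameters for the
    // services they need, and return any child objects to wake up.
    public function __restoreDependencies();
}

class user implements cacheable
{
    private $dao;
    private $name;

    public function __construct(userDao $dao, $name)
    {
        $this->dao = $dao;
        $this->name = $name;
    }

    public function __sleep()
    {
        return array('name');
    }

    public function __restoreDependencies(?userDao $dao = null)
    {
        $this->dao = $dao;
        return array(); // no children to wake up
    }

    public function hasDao()
    {
        return $this->dao instanceof userDao;
    }
}

class userList implements cacheable
{
    private $users;

    public function __construct(array $users)
    {
        $this->users = $users;
    }

    public function __restoreDependencies()
    {
        return $this->users; // every user needs waking up too
    }

    public function getUsers()
    {
        return $this->users;
    }
}

class diContainer
{
    private $services = array();

    // Hypothetical registration method, keyed by class name.
    public function register($name, $service)
    {
        $this->services[$name] = $service;
    }

    public function wakeup(cacheable $object)
    {
        $method = new ReflectionMethod($object, '__restoreDependencies');
        $arguments = array();
        foreach ($method->getParameters() as $parameter) {
            // Assumption: each parameter is type-hinted with a registered service.
            $arguments[] = $this->services[$parameter->getType()->getName()];
        }

        // Step 1: inject the services. Step 2: recurse into returned children.
        $children = $method->invokeArgs($object, $arguments);
        foreach ((array) $children as $child) {
            $this->wakeup($child);
        }
    }
}

$dao = new userDao();
$list = new userList(array(new user($dao, 'Alice'), new user($dao, 'Bob')));
$restored = unserialize(serialize($list)); // stands in for a cache round trip

$diContainer = new diContainer();
$diContainer->register('userDao', $dao);
$diContainer->wakeup($restored);
```

After the call to wakeup(), every user in the restored list has its DAO again, and neither user nor userList ever refers to the container.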

The result is that I can cache complex object graphs without dependencies, but have these dependencies automatically “fixed” when objects are retrieved from cache. The objects themselves don’t really know anything about the process. Instead all they need to do is define a simple interface which defines the required dependencies.


6 Responses to “Caching dependency-injected objects”

  1. Maybe using setter injection instead of constructor injection you can reuse the same interface for a) injection of collaborators and b) waking up. I mean substituting the __construct() parameters with a few initXXX() methods which accept an object if the correspondent field property is not already set. After de-serialization the properties are null so the init*() methods work again.
    Anyway, you shouldn’t use methods that start with __. :) They are reserved for future use and convey a bad semantic (no magic behavior).

  2. Really interesting!

    Thanks for sharing!

  3. Mark Baker says:

    An excellent article
    I’ve been working on something similar myself, using the observer pattern with my cacheable objects to trigger the caching whenever they are modified in any way. Surprisingly, I found that it was more efficient in terms of execution time to call a detach() method to set the observer property to null rather than use the __sleep() magic method. Perhaps it was the overhead of needing to return an array of the properties that I did want serializing, and it might vary from object to object.
    While I haven’t had any problems with caching to APC, memcache, a disk file, or php://temp, I’ve found that caching the serialized objects to a simple array is giving me intermittent memory leakage. Despite setting the observer property of the objects to null before serializing, I’m still getting orphaned instances left in memory, as I would if the observer property refcount wasn’t yet 0. I’m wondering if the garbage collection isn’t kicking in quickly enough to clear the cyclic reference.

  4. Dave says:

    Some interesting points Giorgio. I accept that I shouldn’t use __!

    Using setter rather than construction injection may be a good way to go regarding reusing the interface. My __restoreDependencies method could be simply setDependencies to make things simpler.

    However the problem would still remain that the injection would not be performed automatically after unserialize().

  5. Dave says:

    Interesting idea regarding the use of the Observer pattern, Mark. I must admit I haven’t yet got into any optimisations. I will have to get a profiler on it at some point.

  6. Luka Peharda says:

    Interesting stuff. Thank you
