A mapper pattern for PHP

The “mapper” pattern allows us to either turn an object graph into a representation, or turn a representation into an object graph.

This sounds very much like straightforward serialisation, but there are some key differences.

  1. It is not necessarily two-way
    It is possible to map from an object graph to a representation that does not contain all the information to reconstruct an object graph. An example might be a case where we map a user’s “screen name” out of the user object; a useful piece of information, but not enough to construct a fully-formed user object on its own.
  2. The representation is flexible
    It is possible to write mappers that map to and from various different representation formats (eg: JSON, XML, Cassandra Mutation Map). PHP’s in-built serialisation has a fixed end-result, determined by the PHP engine itself.
  3. There are a number of added-extras
    These include in-built caching, the ability to override values and provide defaults. More on this later.

How they fit into the overall architecture

I’m a huge fan of Domain Driven Design. When implementing a new set of functionality, I usually start with CRC cards, formulate a design for the domain objects and then start on prototyping in conjunction with unit testing. Aside – I generally don’t practice Test Driven Development, although I will hopefully try it out at some point.

The mappers come into play when creating representations of domain objects or creating an object graph from representations.

How mappers can be used in overall architecture

Web Service

RESTful Web Service GET requests can be served by mapping domain object graphs to representations (JSON, XML, HTML). The schema (more on this later) allows intricate control over exactly which bits of the domain object graph are mapped, allowing for a variety of different representations of the same domain objects. I’m ignoring HATEOAS for the purposes of this post, although this can easily be added.

RESTful Web Service POST and PUT requests can be handled by mapping the POSTed or PUT representations into a domain object graph and then saving these via the persistence layer.

Persistence

Retrieving domain objects from a persistence service (loading) can be facilitated by the representation to object mapper. One example of a specific implementation is a Cassandra Mutation Map (a representation) to object mapper.

Persisting domain objects (saving) can be facilitated by the object to representation mapper, for example mapping to a Cassandra Mutation Map or SQL.

Building blocks of the domain layer

There are many ways that this mapper pattern could be implemented. The implementation I have created relies on a number of consistent design principles within the domain layer. The important building blocks are outlined below.

Keyed objects, value objects, collection objects

We can construct our domain objects around three core object types. Any concrete domain object therefore shares the functionality of one of these types and implements a common interface.

  • Keyed objects
    Domain objects that have a uniquely identifiable key – globally unique within the application. These objects are stored in an Identity Map to make sure only one instance for each unique key value is ever created.
  • Value objects
    Domain objects that do not have a uniquely identifiable key. These objects cannot be stored in an Identity Map, nor would it make sense to do so.
  • Collections
    Domain objects that represent a list of other objects. These, at their most basic level, implement the Iterator interface. Each different type of domain object will have its own accompanying collection object.
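
To make the Identity Map idea concrete, here is a minimal sketch of how keyed objects and an Identity Map might fit together. All of the names (sketch_keyed_interface, sketch_keyedPlace, sketch_identityMap) are hypothetical and are not taken from the actual implementation; the factory callback is an assumption standing in for whatever really creates the object.

```php
<?php
// Hypothetical sketch: none of these names come from the original implementation.

/** Keyed objects expose a key that is unique within the application. */
interface sketch_keyed_interface
{
    public function getKey();
}

/** A keyed domain object. */
class sketch_keyedPlace implements sketch_keyed_interface
{
    private $key;
    private $title;

    public function __construct($key, $title)
    {
        $this->key = $key;
        $this->title = $title;
    }

    public function getKey()
    {
        return $this->key;
    }

    public function getTitle()
    {
        return $this->title;
    }
}

/** Identity Map: hands back the same instance for a given key. */
class sketch_identityMap
{
    private $objects = array();

    /**
     * Return the stored instance for this key, creating it via $factory
     * only if no instance exists yet.
     */
    public function get($key, $factory)
    {
        if (!isset($this->objects[$key])) {
            $this->objects[$key] = call_user_func($factory, $key);
        }
        return $this->objects[$key];
    }
}

$map = new sketch_identityMap();
$factory = function ($key) {
    return new sketch_keyedPlace($key, 'Pedestrian Gate E');
};
$a = $map->get('gate-e', $factory);
$b = $map->get('gate-e', $factory);
// $a and $b now refer to the one instance held by the map.
```

Value objects have no key to index on, which is why they cannot take part in this scheme.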

Virtual-proxy pattern for lazy-loading

All keyed domain objects can be replaced with a virtual-proxy. This behaves like the original object (implements the same interface) but only loads itself from the database at the last minute when needed.
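
A virtual-proxy of this kind could be sketched roughly as follows. The interface, the class names and the loader callback are all assumptions made for illustration; the counter simply stands in for a database read so that the lazy behaviour is visible.

```php
<?php
// Hypothetical sketch: interface, class and loader names are assumptions.

interface sketch_user_interface
{
    public function getUsername();
    public function getName();
}

/** The real, fully-formed domain object. */
class sketch_user implements sketch_user_interface
{
    private $username;
    private $name;

    public function __construct($username, $name)
    {
        $this->username = $username;
        $this->name = $name;
    }

    public function getUsername() { return $this->username; }
    public function getName() { return $this->name; }
}

/** Stands in for a user; loads the real object only when it is needed. */
class sketch_user_proxy implements sketch_user_interface
{
    private $username;
    private $loader;
    private $real = null;

    public function __construct($username, $loader)
    {
        $this->username = $username;
        $this->loader = $loader;
    }

    /** The key is already known, so no load is needed. */
    public function getUsername()
    {
        return $this->username;
    }

    /** Anything beyond the key forces the real object to be loaded. */
    public function getName()
    {
        return $this->load()->getName();
    }

    private function load()
    {
        if ($this->real === null) {
            $this->real = call_user_func($this->loader, $this->username);
        }
        return $this->real;
    }
}

$loads = 0;
$loader = function ($username) use (&$loads) {
    $loads++; // stands in for a database read
    return new sketch_user($username, 'Dave Gardner');
};
$user = new sketch_user_proxy('davegardnerisme', $loader);
$user->getUsername();          // answered from the key, no load
$loadsBeforeName = $loads;
$name = $user->getName();      // first call that needs real data
```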

Getters

All objects have a consistently named “getter” defined. I used to think this was a bad idea but I have since mellowed in my opinion (setters are still evil).

Object to representation mapper

The aim here is to turn an object graph into a representation. An example:

class user {
    public function getUsername();
    public function getName();
    public function getRegisteredTimestamp();
}

We then map this to a JSON representation using the schema:

$schema = array(
    'username',
    'name',
    'registeredTimestamp'
    );

Resulting in:

{"username":"davegardnerisme","name":"Dave Gardner","registeredTimestamp":1284850800}

Schemas

Object graphs nearly always have great complexity, often involving circular relationships. When mapping to representations we often don’t want all this complexity, nor would it be feasible to include it all! With a rich domain layer built up from many contained lazy-loading objects, if you continue to dig down into the object graph you could end up loading every single object that exists within your application!

This is why we use schemas when mapping from objects to representations – we need to choose what to actually map. What we are really doing is identifying how far to dip into the object graph when formulating the representation.

How it works

The implementation of the object-to-representation concept has a number of key elements:

  1. Object graph walker
    Aware of how to walk through the object graph, according to the given schema, drilling into any contained objects and collections.
  2. Object property extractor
    Able to extract a property from an object according to a schema.
  3. Property to value converter
    Able to turn a retrieved property into a scalar value (string/integer/float).

Pseudo code

We run this code passing in the object to map from plus the schema. The code is structured to be run recursively, building the output array (passed by reference) as it goes.

foreach (entry in schema)
{
    if (schema entry indicates we should map to a list)
    {
        assert (we have a list)
        foreach (item in list)
        {
            call this recursively with this item and the sub-selection of schema
        }
    }
    else
    {
        extract property of object according to the schema
        convert this property to a scalar value
        add this property to the mapped-to data
    }
}
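
The pseudo code above could be turned into PHP along the following lines. This is a sketch under stated assumptions rather than the real service_mapper: the function names and the SKETCH_LOOP_ITEMS constant are hypothetical, properties are extracted through “get” + property name getters, and the output array is returned rather than passed by reference.

```php
<?php
// Hypothetical sketch; the real service_mapper API may differ.
const SKETCH_LOOP_ITEMS = '__loop_items__';

/** Walk the object graph according to the schema and return plain arrays. */
function sketch_map($object, array $schema)
{
    // A schema of the form array(LOOP_ITEMS => subSchema) means "map each item".
    if (array_key_exists(SKETCH_LOOP_ITEMS, $schema)) {
        if (!is_array($object) && !($object instanceof Traversable)) {
            throw new InvalidArgumentException('Schema expects a list');
        }
        $out = array();
        foreach ($object as $item) {
            $out[] = sketch_map($item, $schema[SKETCH_LOOP_ITEMS]);
        }
        return $out;
    }

    $out = array();
    foreach ($schema as $key => $entry) {
        if (is_int($key)) {
            // Plain entry: extract the property and reduce it to a scalar.
            $out[$entry] = sketch_toScalar(sketch_extract($object, $entry));
        } else {
            // Keyed entry: drill into the contained object with the sub-schema.
            $out[$key] = sketch_map(sketch_extract($object, $key), $entry);
        }
    }
    return $out;
}

/** Extract a property via its consistently named getter. */
function sketch_extract($object, $property)
{
    $getter = 'get' . ucfirst($property);
    if (!method_exists($object, $getter)) {
        throw new InvalidArgumentException("No getter for '$property'");
    }
    return $object->$getter();
}

/** Scalars and null pass through; keyed objects collapse to their key. */
function sketch_toScalar($value)
{
    if ($value === null || is_scalar($value)) {
        return $value;
    }
    if (is_object($value) && method_exists($value, 'getKey')) {
        return $value->getKey();
    }
    throw new InvalidArgumentException('Cannot convert value to a scalar');
}

class sketch_mappedCategory
{
    public function getKey() { return 'gates'; }
    public function getTitle() { return 'Gates'; }
}

class sketch_mappedPlace
{
    public function getTitle() { return 'Pedestrian Gate E'; }
    public function getCategory() { return new sketch_mappedCategory(); }
}

$schema = array(
    SKETCH_LOOP_ITEMS => array(
        'title',
        'category' => array('key', 'title'),
    ),
);
$result = sketch_map(array(new sketch_mappedPlace()), $schema);
$json = json_encode($result);
```

Listing 'category' as a plain entry instead of giving it a sub-schema would map it to its key alone, which is one way the schema controls how far the mapper dips into the graph.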

Caching

One interesting thing about the object to representation mapper is that you can add in a caching layer which avoids, in many cases, the need to actually carry out the mapping. This is particularly effective when complex object graphs are built from immutable objects. The reason is that we don’t actually need to load the objects themselves from the database; we can use the virtual-proxy key to give us a cache key and then simply load the representation directly. With PHP it’s usually a good idea to use APC to cache representations (over Memcached) to avoid Memcached’s default 1MB limit on item size. This makes it more effective when mapping/caching large object graphs to large representations.

A caching system can be added into the mapping code:

if (object to map is a keyed object)
{
    cache key = hash on (schema + object key)
    if (!exists in cache)
    {
        map as normal
        store the representation in the cache
    }
    return the cached representation
}
else
{
    map as normal
}
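
As a rough PHP sketch of that caching step: the class below wraps any mapping callback and caches by schema plus object key. The class name is hypothetical, and a plain in-memory array stands in for APC so that the example is self-contained.

```php
<?php
// Hypothetical sketch: an in-memory array stands in for APC, and
// sketch_cachingMapper is not the original class name.

class sketch_cachingMapper
{
    private $schema;
    private $mapFn;
    private $cache = array();

    /**
     * @param array    $schema The schema to map with
     * @param callable $mapFn  Does the real mapping: function ($object, array $schema)
     */
    public function __construct(array $schema, $mapFn)
    {
        $this->schema = $schema;
        $this->mapFn = $mapFn;
    }

    public function map($object)
    {
        // Only keyed objects have something stable to build a cache key from.
        if (!method_exists($object, 'getKey')) {
            return call_user_func($this->mapFn, $object, $this->schema);
        }

        $cacheKey = md5(serialize($this->schema) . '|' . $object->getKey());
        if (!array_key_exists($cacheKey, $this->cache)) {
            // Cache miss: map as normal and remember the result.
            $this->cache[$cacheKey] = call_user_func($this->mapFn, $object, $this->schema);
        }
        return $this->cache[$cacheKey];
    }
}

class sketch_cachedPlace
{
    public function getKey() { return '10b11ab0'; }
    public function getTitle() { return 'Pedestrian Gate E'; }
}

$calls = 0;
$mapFn = function ($object, array $schema) use (&$calls) {
    $calls++; // counts how often the real mapping runs
    return array('key' => $object->getKey(), 'title' => $object->getTitle());
};

$mapper = new sketch_cachingMapper(array('key', 'title'), $mapFn);
$first = $mapper->map(new sketch_cachedPlace());
$second = $mapper->map(new sketch_cachedPlace());
```

Because the cache key only needs getKey(), a virtual-proxy can be passed in and, on a cache hit, is never asked to load itself.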

Object verification

The domain layer is built upon the principle of lazy-loading, making use of the virtual-proxy pattern. These virtual-proxies will initialise a concrete object when they need to. Remember that a virtual-proxy has a property that indicates some kind of unique identifier for the object, and it knows how to load itself. Therefore when mapping, if we need to turn a virtual-proxy object into a single value, we don’t need to load it, since we already have the unique identifier. This is a nice optimisation. However, this is not always desirable.

Sometimes you want to guarantee that objects exist, by forcing any objects touched via the mapper to be loaded from the database. This is where the verification feature comes in.

Overrides

We can tweak the final representation by adding overrides. These are a way of saying “please ignore any value that could be extracted from the object and use this object/callback instead”. One interesting way I have used these is to attach URL properties to domain objects. URLs are a property that does not usually belong in the problem domain, but rather are concerned with a specific representation scheme – HTTP. Therefore the overrides can be added within the web service layer to allow us to map a URL for a domain object within a specific context. One interesting point to note, required by the URL example, is that the override doesn’t actually have to override anything. The original object does not necessarily have to have this property in the first place.
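
The URL example might be sketched like this. The function name and the shape of the overrides array (property name to fixed value or closure) are assumptions for illustration; here the closure is handed the object being mapped so that it can build a URL from the key.

```php
<?php
// Hypothetical sketch: function name and override format are assumptions.

/**
 * Map the named properties, letting overrides win over getters.
 *
 * @param object $object     The domain object to map
 * @param array  $properties Property names to include
 * @param array  $overrides  Property name => fixed value, or closure taking the object
 *
 * @return array
 */
function sketch_mapWithOverrides($object, array $properties, array $overrides)
{
    $out = array();
    foreach ($properties as $property) {
        if (array_key_exists($property, $overrides)) {
            $override = $overrides[$property];
            // The object is never asked for this property, so it need not have it.
            $out[$property] = ($override instanceof Closure)
                ? $override($object)
                : $override;
            continue;
        }
        $getter = 'get' . ucfirst($property);
        $out[$property] = $object->$getter();
    }
    return $out;
}

class sketch_overridePlace
{
    public function getKey() { return 'gate-e'; }
    public function getTitle() { return 'Pedestrian Gate E'; }
}

// Defined in the web service layer, where URLs belong.
$overrides = array(
    'url' => function ($place) {
        return '/places/' . $place->getKey();
    },
);
$result = sketch_mapWithOverrides(
    new sketch_overridePlace(),
    array('key', 'title', 'url'),
    $overrides
);
```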

Example

All these examples use the domain objects from my GlastoFinder project. You can check out the interfaces for these in the last section of this post.

Mapping the list of places to JSON representation

$schema = array(
    service_mapper::LOOP_ITEMS => array(
        'key',
        'title',
        'category' => array(
            'key',
            'title'
            ),
        'location' => array(
            'latitude',
            'longitude'
            ),
        'icon',
        'hashTag',
        'details'
        )
    );
$mapper = $this->diContainer->getInstance(
        'service_mapper_objectToArray',
        $schema
        );
$json = $mapper->map($list);

Sample output:

[
    {
        "key": "0f45b80d-96d5-546d-bade-c2e583489783",
        "title": "Poetry & Words",
        "category": {
            "key": "other_venues",
            "title": "Other Venues"
        },
        "location": {
            "latitude": 51.149364968572,
            "longitude": -2.5799948897032
        },
        "icon": "/i/otherstage-sm.png",
        "hashTag": "poetryandwords",
        "details": null
    },
    {
        "key": "10b11ab0-e164-5ac7-8ea4-ea5670cdf54e",
        "title": "Pedestrian Gate E",
        "category": {
            "key": "gates",
            "title": "Gates"
        },
        "location": {
            "latitude": 51.147641084707,
            "longitude": -2.6001275707455
        },
        "icon": "/i/gate-sm.png",
        "hashTag": "gatee",
        "details": null
    }
]

Representation to object mapper

This is the complete opposite of the object to representation mapper. We take some representation and then turn this into an object graph, potentially constructed of many related objects. An example:

Representation:

{"username":"davegardnerisme","name":"Dave Gardner","registeredTimestamp":1284850800}

Will be able to yield us a user object with the following properties:

class user {
/**
 * Constructor
 *
 * @param string $username The user's username - an alphanumeric (a-zA-Z0-9) string, unique
 * @param string $name The user's full name, forename plus surname, or however they want to name themselves
 * @param integer $registeredTimestamp The date this user was registered
 */
public function __construct(
    $username,
    $name,
    $registeredTimestamp
    )
}

The mapper, when asked to construct a user object, will examine the values needed by the object (by looking at its constructor) and then extract these properties from the representation. Unlike the object to representation mapper, no schema is needed. The schema is inherent in the object definitions themselves.

How it works

The implementation of the representation-to-object concept has a number of key elements:

  1. Code analysis tool
    Ideally a static analysis tool to avoid the cost of reflection. This would know what parameters are needed to construct each domain object.
  2. Recursive object-building code
    Able to determine what type of thing to build (depends on what value available in the representation) and then build it.

Pseudo code

We run this code passing in the representation to map from plus the type of object to build. The code is structured to be run recursively, building the object graph as it goes.

decide which value to use by looking at representation, defaults and overrides
if (value to use is a scalar [string, int, float])
{
    build placeholder domain object
}
else
{
    build full domain object
}

if (thing to build is list)
{
    foreach (item within representation)
    {
        call function recursively to build item, then add to list
    }
}

verify all built objects, if required
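
A cut-down PHP sketch of the core of this is shown below. It uses reflection to read the constructor parameters, which is the simple option rather than the static analysis mentioned above, and it leaves out nested objects, collections and verification. The names sketch_account, sketch_placeholder and sketch_build are hypothetical.

```php
<?php
// Hypothetical sketch using reflection; it does not recurse into nested
// objects or collections, and all names are assumptions.

class sketch_account
{
    private $username;
    private $name;
    private $registeredTimestamp;

    public function __construct($username, $name, $registeredTimestamp)
    {
        $this->username = $username;
        $this->name = $name;
        $this->registeredTimestamp = $registeredTimestamp;
    }

    public function getUsername() { return $this->username; }
    public function getName() { return $this->name; }
    public function getRegisteredTimestamp() { return $this->registeredTimestamp; }
}

/** Stands in for an object that is known only by its key. */
class sketch_placeholder
{
    private $class;
    private $key;

    public function __construct($class, $key)
    {
        $this->class = $class;
        $this->key = $key;
    }

    public function getClass() { return $this->class; }
    public function getKey() { return $this->key; }
}

/**
 * Build an object of $class from $value.
 *
 * @param string $class     Class to build
 * @param mixed  $value     Scalar key or associative array from the representation
 * @param array  $defaults  Parameter name => value used when the representation lacks it
 * @param array  $overrides Parameter name => closure whose result always wins
 */
function sketch_build($class, $value, array $defaults = array(), array $overrides = array())
{
    if (is_scalar($value)) {
        // A scalar is assumed to be the object's key.
        return new sketch_placeholder($class, $value);
    }

    $reflection = new ReflectionClass($class);
    $args = array();
    foreach ($reflection->getConstructor()->getParameters() as $param) {
        $name = $param->getName();
        if (array_key_exists($name, $overrides)) {
            $args[] = call_user_func($overrides[$name]);
        } elseif (array_key_exists($name, $value)) {
            $args[] = $value[$name];
        } elseif (array_key_exists($name, $defaults)) {
            $args[] = $defaults[$name];
        } else {
            throw new InvalidArgumentException("Missing value for '$name'");
        }
    }
    return $reflection->newInstanceArgs($args);
}

$json = '{"username":"davegardnerisme","name":"Dave Gardner","registeredTimestamp":1284850800}';
$user = sketch_build('sketch_account', json_decode($json, true));
$reference = sketch_build('sketch_account', 'davegardnerisme');
```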

Overrides and defaults

When we map from a representation to an object, it’s often useful to be able to tweak the process, supplying default values where necessary and overrides in certain situations. To understand a use-case, it’s important to understand the principle of “fully formed objects”. This means that whenever we create an object, we always supply every piece of information to the object constructor, meaning that if ever we have an object instance, we know that all the data is present and correct. So for example a user object may have a createdTimestamp; this should be supplied, as a valid value, whenever we construct the object. To adhere to this, we could not pass in a NULL value and leave it to the object to provide a default value (eg: now). Instead we should use a Factory for this, or we could use a mapper default!

If we had a web service end point that created a new user, we may require a representation (eg: JSON) to be PUT. However, what about the createdTimestamp? Should this be in the PUT representation? My thinking is that no, it shouldn’t. Instead we use the mapper override feature to force the createdTimestamp to be exactly the time that the user was created, according to the server processing the request.

The following illustrates defining an override callback to force a createdTimestamp to be defined at time of mapping. This makes use of PHP 5.3 anonymous functions.

$mapper = $diContainer->getInstance('service_mapper_jsonToObject');
$mapper->addOverride(
    array('createdTimestamp'),
    function () { return time(); }
    );

Rules for building domain objects

Placeholders (virtual-proxy) vs full domain objects

The mapping algorithm ultimately boils down to a situation where we need to build some object, based on some value. This value could be a scalar (string, integer, float) or an array. We need some rules to determine how we go about building a domain object based on the situation we find, specifically what type of value we have.

  • Scalar – string, integer or float
    When we are asked to build a domain object and we find a scalar in the representation, we make the assumption that this value refers to some unique identifier for the object and therefore build a placeholder object (virtual-proxy) instead.
  • Associative array
    When we are asked to build a domain object and we are presented with an array of values, we make the assumption that all of the necessary constructor properties are present in the representation. We then dig through these values and match them up with constructor arguments, before finally constructing the fully-formed domain object.

Collections – recursion

Whenever we are building a collection object, we look for a numerically-indexed array of items within the representation. We then call the “turn a value into a domain object” method recursively to yield us objects to push onto our list. The assumption here is that collections are always represented as an Iterable object within the domain layer.

Verification

When we carry out the mapping, we may create any number of placeholder (virtual-proxy) objects in place of real domain objects. These will then be lazy-loaded on-demand. This is all fine and good, but not if you want to ensure that all the objects are valid. Luckily it is trivial for the mapper to keep track of any placeholders it mints during its mapping job. With the verification flag set, this list of placeholders can then be force-loaded to ensure that they actually exist. Whether or not this is desirable depends on the situation; when accepting PUT or POSTed representations via a web service the safety net is useful, whereas when mapping from a database representation we often don’t want the overhead.
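
The bookkeeping involved might look something like this sketch. The class names, the load() method and the loader callback are hypothetical; the loader throws for unknown keys to stand in for a failed database lookup.

```php
<?php
// Hypothetical sketch: class names, load() and the loader are assumptions.

/** A placeholder that can be forced to load itself. */
class sketch_lazyPlace
{
    private $key;
    private $loader;
    private $loaded = false;

    public function __construct($key, $loader)
    {
        $this->key = $key;
        $this->loader = $loader;
    }

    public function getKey()
    {
        return $this->key;
    }

    /** Force the load; the loader throws if the object does not exist. */
    public function load()
    {
        if (!$this->loaded) {
            call_user_func($this->loader, $this->key);
            $this->loaded = true;
        }
    }
}

/** Tracks every placeholder it mints so they can be verified afterwards. */
class sketch_verifyingMapper
{
    private $loader;
    private $verify;
    private $minted = array();

    public function __construct($loader, $verify)
    {
        $this->loader = $loader;
        $this->verify = $verify;
    }

    public function mintPlaceholder($key)
    {
        $placeholder = new sketch_lazyPlace($key, $this->loader);
        $this->minted[] = $placeholder;
        return $placeholder;
    }

    /** Call once mapping is done; force-loads placeholders if verifying. */
    public function finish()
    {
        if ($this->verify) {
            foreach ($this->minted as $placeholder) {
                $placeholder->load();
            }
        }
        $this->minted = array();
    }
}

$known = array('gate-e' => true);
$loader = function ($key) use ($known) {
    if (!isset($known[$key])) {
        throw new RuntimeException("No such place: $key");
    }
    return $key;
};

$mapper = new sketch_verifyingMapper($loader, true);
$mapper->mintPlaceholder('gate-e');
$mapper->mintPlaceholder('no-such-place');
try {
    $mapper->finish();
    $verified = true;
} catch (RuntimeException $e) {
    $verified = false; // the second placeholder could not be loaded
}
```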

GlastoFinder domain object reference

Location interface

interface domain_location_interface
{
    /**
     * Get latitude
     *
     * @return float
     */
    public function getLatitude();

    /**
     * Get longitude
     *
     * @return float
     */
    public function getLongitude();

    /**
     * Get distance to another location
     *
     * Uses "haversine" formula
     * @see http://www.movable-type.co.uk/scripts/latlong.html
     *
     * @param mz_domain_location $otherLocation The other location
     *
     * @return float The distance measured in kilometres
     */
    public function getDistanceTo(mz_domain_location $otherLocation);
}
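
For reference, the “haversine” calculation mentioned in the docblock can be sketched as a standalone function. This is not the GlastoFinder implementation: the function name is hypothetical and 6371 km is the commonly used mean Earth radius.

```php
<?php
// Hypothetical sketch of the haversine formula; not the GlastoFinder code.

/**
 * Great-circle distance between two points given in decimal degrees.
 *
 * @param float $lat1 Latitude of the first point
 * @param float $lon1 Longitude of the first point
 * @param float $lat2 Latitude of the second point
 * @param float $lon2 Longitude of the second point
 *
 * @return float The distance measured in kilometres
 */
function sketch_haversineKm($lat1, $lon1, $lat2, $lon2)
{
    $earthRadiusKm = 6371.0; // mean Earth radius, an approximation

    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);

    $a = sin($dLat / 2) * sin($dLat / 2)
        + cos(deg2rad($lat1)) * cos(deg2rad($lat2))
        * sin($dLon / 2) * sin($dLon / 2);
    $c = 2 * atan2(sqrt($a), sqrt(1 - $a));

    return $earthRadiusKm * $c;
}

// Distance between the two sample places shown earlier in this post.
$km = sketch_haversineKm(
    51.149364968572, -2.5799948897032,
    51.147641084707, -2.6001275707455
);
```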

Place interface

interface domain_place_interface
        extends     domain_keyed_interface,
                    domain_hasLocation_interface
{
    /**
     * Get title
     *
     * @return string
     */
    public function getTitle();

    /**
     * Get category
     *
     * @return domain_place_category_interface
     */
    public function getCategory();

    /**
     * Get icon
     *
     * @return string
     */
    public function getIcon();

    /**
     * Get hash tag
     *
     * @return domain_hashTag_interface
     */
    public function getHashTag();

    /**
     * Get details
     *
     * @return string
     */
    public function getDetails();
}

3 Responses to “A mapper pattern for PHP”

  1. Hello Dave,

    1. “I used to think this was a bad idea but I have since mellowed in my opinion (setters are still evil).”
    May i ask why?

    2. “Virtual-proxy pattern for lazy-loading”
    Can you please provide an example for this?
    I wonder why I would instantiate a domain object which is never used? (wrong lifetime in objectgraph?)

    3. “Keyed objects”
    Why you call DDD Entities “Keyed objects”?

    4. “Aggregates”
    Where do you put domain logic which operates on data and its “metadata”?
    eg: user->rename()
    tables: user_name, user_change_history

    Thank you

  2. Dave says:

    1. Why have I mellowed? Because it means you can have a mapper pattern, where the mapper chooses which bits of information it wants to extract from the object. I tried an alternative once, which gave responsibility to the object for “exporting” its information to some container. This had a major drawback: because the object had the responsibility, the mapper could not choose what to export and what not to. I am going to be blogging about this soon.
    2. There is an example of this within this post on Interfaces. In terms of “why”, the reason being that in a complex system, every object will probably be connected to every other object. To carry out some simple feature of a site (display a welcome message to a user, for example), you almost certainly don’t want to load every object from your database. Hence the lazy-load pattern.
    3. No reason. They are objects with keys, so I called them keyed objects. Google suggests that DDD entities are exactly the same concept.
    4. I would make the object immutable and use a builder pattern that ultimately yields a brand new object in a situation where you want to change details within the object. You would then ask the Data Access Object to persist your new user; it would then be responsible (if required) to update any “user change history” relating to that object. I am probably going to be blogging about this soon as well.

    Hope that helps.

  3. Pau says:

    Well done David … I’ve learnt a bit more about mappers. I shall try my own implementation and then compare it with yours ;) !

    I may digg into it at some point if I have time, thanks for sharing!
