Duck-typing in PHP
Matthias Noback
For quite some time now the PHP community has been becoming more and more professional. “More professional” in part means that we use more types in our PHP code. Though it took years to introduce more or less decent types in the programming language itself, it took some more time to really appreciate the fact that by adding parameter and return types to our code, we can verify its correctness in better ways than we could before. And although all the type checks still happen at runtime, it feels as if those type checks already happen at compile time, because our editor validates most of our code before actually running it.
To make it perfectly clear: this is all very awesome. In fact, I hope that PHP will change to become more of a static language than a dynamic one. I can very well remember the times when we actually relied on PHP doing the type juggling for us, but I’m happy we’ve left that phase behind. I think that nowadays many PHP developers agree that silent type conversions are neither very useful nor safe.
But sometimes it’s good to remember what’s possible with PHP, due to it being a dynamic scripting language. I recently encountered a situation where I wanted to build a generic repository, which would be able to keep track of entities, allowing the user to store and retrieve them by their ID.
class SomeEntity
{
    public function id()
    {
        return $this->id;
    }
}

class GenericRepository
{
    public function store($object)
    {
        $id = $object->id();
        ...
    }

    public function getById($id)
    {
        return ...;
    }
}
So, what are the types we should introduce in this scenario? $id might be a simple string, although these days identifier strings will often get wrapped in their own dedicated value object. Maybe we could enforce an interface for Id type of objects? But then people won’t be able to use a simple string anymore. Do I want to force that upon them? The same goes for the objects that our repository is going to store. $object might be typed as an Entity interface (since an object with identity is basically what we call “entity”), which has a method id(), which returns an identifier:
interface Id
{
    public function __toString() : string;
}

interface Entity
{
    public function id() : Id;
}
Do we want to force the term Entity onto the user’s code? Do we want to force users to implement the Id interface? What if there is no user we can force? What if the “entity” we want to store in our repository is defined in a third-party library?
It doesn’t have to be that way. Hey, it’s PHP! We only want the user to provide an object which we can use in the following way:
public function store($object) {
    $id = $object->id();

    /*
     * $id should be a string, or usable as a string (i.e. it has a __toString() method)
     *
     * In fact, we might as well just cast it to a string to be sure:
     */
    $id = (string) $id;

    ...
}
The funny thing is, whatever value the user provides, we can already do this. As long as the method id() exists on the object and PHP can successfully cast its return value to a string, we’re fine. As long as we don’t define any type at all for the $object parameter, PHP will do no type checking and will just try to do whatever you ask it to do, and throw warnings/errors/exceptions whenever it fails.
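To make this concrete, here is a minimal sketch of such an untyped repository. The names InMemoryRepository, OrderId, Order and Customer are made up for this example, and the array-backed storage is only an assumption about how the elided parts could be filled in:

```php
<?php

// Hypothetical names for illustration; none of these come from a real library.
final class InMemoryRepository
{
    /** @var array<string, object> */
    private $objects = [];

    // No parameter type: any object with a usable id() method is accepted.
    public function store($object)
    {
        // Fails at runtime if id() is missing or its result cannot be cast.
        $id = (string) $object->id();
        $this->objects[$id] = $object;
    }

    public function getById($id)
    {
        $key = (string) $id;
        if (!isset($this->objects[$key])) {
            throw new OutOfBoundsException("No object stored with ID '$key'");
        }

        return $this->objects[$key];
    }
}

// An ID value object that is usable as a string.
final class OrderId
{
    private $value;

    public function __construct($value)
    {
        $this->value = $value;
    }

    public function __toString()
    {
        return $this->value;
    }
}

// Two unrelated classes; neither implements a shared interface.
final class Order
{
    private $id;

    public function __construct(OrderId $id)
    {
        $this->id = $id;
    }

    public function id()
    {
        return $this->id;
    }
}

final class Customer
{
    public function id()
    {
        return 'customer-7';
    }
}

$repository = new InMemoryRepository();
$order = new Order(new OrderId('order-42'));
$customer = new Customer();

$repository->store($order);    // id() returns an object with __toString()
$repository->store($customer); // id() returns a plain string
```

Both objects can be stored and retrieved by ID, even though the repository knows nothing about their classes. Passing an object without an id() method would only fail once store() tries to call it.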
There is only one problem, though many of us, myself included, will consider it a very big one: our IDE isn’t able to help us anymore. It won’t be able to verify that methods exist or that passed function argument types are correct. It won’t let us click through to class definitions, etc. In other words, we lose our ability to do a little bit of the type checking before runtime.
How to fix this? By helping your IDE to figure it out. PhpStorm for example allows you to define @var or @param annotations to make intended types explicit.
public function store($object) {
    /** @var Entity $object */
    ...
}

// or (this might show some IDE warnings in the user's code):

/**
 * @param Entity $object
 */
public function store($object) {
    ...
}
So, even when $object doesn’t actually implement Entity, it will still be treated by the IDE as if it does.
This, by the way, is known as duck typing. Type checks happen at runtime:
With normal typing, suitability is assumed to be determined by an object’s type only. In duck typing, an object’s suitability is determined by the presence of certain methods and properties (with appropriate meaning), rather than the actual type of the object.
Introducing the php-duck-typing library
The only problem with simply adding a type hint to a value like this is that PHP will simply crash at some point if the passed value doesn’t meet our expectations. When we call store() with an object that doesn’t really match the Entity interface, we would like to give the user some insight into what might be wrong. We’d like to know what was wrong with the object we passed to store(), e.g.:
- The object doesn’t implement the Entity interface.
- It does offer the method id().
- id() doesn’t return an object with a __toString() method though.
In other words: we need some proper validation!
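Done by hand, such checks might look like the following sketch. The function assertUsableAsEntity() is a hypothetical helper written for this example; it is not part of any library, and it assumes PHP 7.1 or later for the void return type:

```php
<?php

// Hypothetical helper: checks by hand that $object can be used the way
// store() needs it, and explains what is wrong if it cannot.
function assertUsableAsEntity($object): void
{
    if (!is_object($object)) {
        throw new InvalidArgumentException(
            'Expected an object, got ' . gettype($object)
        );
    }

    // is_callable() also rejects an id() method that exists but is not public.
    if (!is_callable([$object, 'id'])) {
        throw new InvalidArgumentException(
            get_class($object) . ' has no method id()'
        );
    }

    // Note: this calls id() once, purely to inspect its return value.
    $id = $object->id();

    $usableAsString = is_string($id)
        || (is_object($id) && method_exists($id, '__toString'));

    if (!$usableAsString) {
        throw new InvalidArgumentException(
            get_class($object)
            . '::id() should return a string or an object with __toString()'
        );
    }
}
```

Calling this at the top of store() would turn a late, confusing failure into an early exception with a specific message. The downside is that every expectation has to be spelled out manually, which gets tedious as soon as there is more than one method to check.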
Let me introduce you to my new, highly experimental open source library: php-duck-typing. It allows you to run checks like this:
public function store($object) {
    // this will throw an exception if the object is not usable as Entity:
    Object($object)->shouldBeUsableAs(Entity::class);

    ...
}
Just wanted to let you know that this exists. I had some fun exploring the options. Some open issues:
- Could an object with a __toString() method be used as an actual string value?
- What about defining other types which we can use as pseudo-types, e.g. arrays as traversables, arrays as maps, etc.?
I’d be interested to hear your thoughts about this.
For now, this library at least supports the use case I described in this article. I’m not sure if it has a real future, to be honest. Consider it an experiment.