PHP: Setting up a Stream Wrapper for Manipulating the DOM

Posted by Matthias Noback

I recently felt a strong urge to write something about implementing a stream wrapper for PHP. A stream wrapper is PHP's way of handling file-like interaction with any kind of resource. PHP has built-in stream wrappers for HTTP, FTP, the filesystem, etc., but you are also allowed to implement custom protocols using your own stream wrapper. Stream wrappers are used by all file functions, like fopen() and fgets().

Creating a custom stream wrapper begins with creating a class (it does not have to extend anything) and then making a call to stream_wrapper_register(). In this post (and future posts on the same subject) I will develop a stream wrapper for manipulating a DOMNode's value using traditional file manipulation functions. The stream wrapper class is called "DOMStreamWrapper" and we register it for the protocol "dom":

class DOMStreamWrapper
{
}

stream_wrapper_register('dom', 'DOMStreamWrapper');
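
Since stream_wrapper_register() returns a boolean and refuses to register a protocol that already has a handler, it can help to guard the call. The following is only a minimal sketch of such a guard; the exception message is my own wording, not something taken from the DOMStreamWrapper repository:

```php
<?php
class DOMStreamWrapper
{
}

// Only register the "dom" protocol if no wrapper has claimed it yet
if (!in_array('dom', stream_get_wrappers(), true)) {
    // stream_wrapper_register() returns false when registration fails
    if (!stream_wrapper_register('dom', 'DOMStreamWrapper')) {
        throw new RuntimeException('Could not register the "dom" stream wrapper');
    }
}
```

After this, "dom" shows up in the list returned by stream_get_wrappers().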

We should now, in theory, be able to open the DOM by providing a URI. The URI should point to the DOM node whose value we wish to modify:

$handle = fopen('dom://versions/versions/latestRelease', 'r');

The host of the URI is meant to be a file name (without extension); the path will be used as an XPath query for retrieving the DOM node you want to manipulate. The above URI selects the "latestRelease" node in versions.xml:

<?xml version="1.0" encoding="UTF-8"?>
<versions>
    <latestRelease>1.3</latestRelease>
</versions>
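
To make the mapping from URI to node a bit more concrete, here is a rough sketch of how the URI could be taken apart with parse_url() and how the path could then be used as an XPath query. The helper name parseDomUri() is hypothetical, and the XML is inlined here instead of being loaded from versions.xml, just to keep the example self-contained; the actual DOMStreamWrapper may do this differently:

```php
<?php
// Hypothetical helper: split a dom:// URI into an XML file name and an XPath query
function parseDomUri($uri)
{
    $parts = parse_url($uri);

    if ($parts === false || !isset($parts['host'], $parts['path'])) {
        throw new InvalidArgumentException(sprintf('Invalid DOM URI "%s"', $uri));
    }

    return array(
        'file'  => $parts['host'] . '.xml', // the host is the file name without extension
        'xpath' => $parts['path'],          // the path doubles as the XPath query
    );
}

$target = parseDomUri('dom://versions/versions/latestRelease');

// Inlined copy of versions.xml (an assumption made for this example only)
$document = new DOMDocument();
$document->loadXML('<versions><latestRelease>1.3</latestRelease></versions>');

$xpath = new DOMXPath($document);
$nodes = $xpath->query($target['xpath']);

echo $nodes->item(0)->nodeValue; // prints "1.3"
```

Here $target['file'] is "versions.xml" and $target['xpath'] is "/versions/latestRelease".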

Beware: this whole stream wrapper is just an experiment - I don't think using it will make your life any better. Just take it as an example.

You will find the full code of the DOMStreamWrapper in my GitHub account, but I will discuss a few of my findings in this post, as well as highlight some parts of the code.

Opening a stream

The first thing a stream wrapper should provide is a stream_open() method. This method receives a path ("dom://versions/versions/latestRelease"), a mode ("r", "r+", "w", etc. - see the fopen() documentation), a bitmask of option flags, and a reference to a variable in which you can set the path that you really opened (for example, if the given path is somehow an alias for something else).

This method should return true if opening the stream was successful, or false otherwise. If anything went wrong, an error should be triggered, but only if the option flags contain STREAM_REPORT_ERRORS. This means that you can use exceptions internally, as long as you catch them in the end, trigger an error if requested, and return a boolean value. In general, stream_open() would look like this:

class DOMStreamWrapper
{
    public function stream_open($path, $mode, $options, &$opened_path)
    {
        try {
            // open the stream

            // possibly throw an exception

            return true;
        } catch (\Exception $e) {
            if ($options & STREAM_REPORT_ERRORS) {
                trigger_error($e->getMessage(), E_USER_WARNING);
            }

            return false;
        }
    }
}

N.B. I found that the $mode argument will be an empty string if the user provides no mode, even if you define a default value for this argument.

Validating the path

It's important to validate the given path in combination with the given mode. First of all, we need to check if the path exists for this stream protocol. If the path is valid, then we also need to check if the mode is valid for this path. The documentation for fopen() makes clear that only the "r" mode does not require write access to the path; all other modes do. The write-only modes are "w", "a", "x" and "c". Any other mode ("r+", "w+", "a+", "x+" and "c+") requires both read and write permissions. Based on the mode we should check if we have the respective rights for this path (possibly with is_writable() and is_readable()).
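
As an illustration of the mode rules described above, the following sketch translates a mode string into the access rights it requires. The function name requiredAccessForMode() is hypothetical, and I am assuming that only the "b" and "t" flags need to be stripped before looking at the mode; treat it as one possible approach, not as the code from the repository:

```php
<?php
// Hypothetical helper: determine which access rights a given fopen() mode requires
function requiredAccessForMode($mode)
{
    // The binary/text flags do not affect the required access rights
    $mode = str_replace(array('b', 't'), '', $mode);

    // This also rejects the empty string
    if (!preg_match('/^[rwaxc]\+?$/', $mode)) {
        throw new InvalidArgumentException(sprintf('Unsupported mode "%s"', $mode));
    }

    $readAndWrite = substr($mode, -1) === '+';

    return array(
        'read'  => $readAndWrite || $mode[0] === 'r', // "r" is the only read-only mode
        'write' => $readAndWrite || $mode[0] !== 'r', // "w", "a", "x" and "c" are write-only
    );
}

$access = requiredAccessForMode('r+');
// $access is array('read' => true, 'write' => true)
```

Inside stream_open() the result could then be compared against is_readable() and is_writable() for the XML file that the path refers to.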

Stream context: retrieve the options

When opening a stream, for example with fopen(), you can add a stream context to provide the stream wrapper with some extra options and parameters. For example, you can set the HTTP method by creating a stream context with an array of options (each key is the name of a protocol and its value is a set of options for this protocol):

$context = stream_context_create(array('http' => array('method' => 'POST')));
$responseContent = file_get_contents('https://matthiasnoback.nl', false, $context);

When a stream is opened, the given context will be stored in the stream wrapper's public property called $context:

class DOMStreamWrapper
{
    public $context;

    // ...
}

But we should also consider the fact that it's possible to define a default stream context, using stream_context_set_default():

stream_context_set_default(array('http' => array('method' => 'POST')));

So, inside the DOMStreamWrapper we should combine the default options and the options provided by the user when opening the stream. In our case a simple merge suffices:

$defaultContext = stream_context_get_default();
$defaultOptions = stream_context_get_options($defaultContext);
$defaultOptions = isset($defaultOptions['dom']) ? $defaultOptions['dom'] : array();

$givenContext = $this->context; // the public attribute "context" contains the current context
$givenOptions = stream_context_get_options($givenContext);
$givenOptions = isset($givenOptions['dom']) ? $givenOptions['dom'] : array();

$options = array_merge($defaultOptions, $givenOptions);

This way, the options given by the user will override the default options.
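
To see the override behaviour in isolation, here is a small sketch that merges the "dom" options of two contexts. The option names "directory" and "extension" and the helper domOptionsFrom() are made up for this example, and both contexts are created explicitly instead of going through stream_context_set_default(), so that the example does not touch any global state:

```php
<?php
// Hypothetical helper: extract the options for the "dom" protocol from a context
function domOptionsFrom($context)
{
    if (!is_resource($context)) {
        return array();
    }

    $options = stream_context_get_options($context);

    return isset($options['dom']) ? $options['dom'] : array();
}

// Stand-in for the default context (option names are assumptions)
$defaultContext = stream_context_create(array(
    'dom' => array('extension' => 'xml', 'directory' => '/data'),
));

// Stand-in for the context the user passed to fopen()
$givenContext = stream_context_create(array(
    'dom' => array('directory' => '/tmp'),
));

$options = array_merge(domOptionsFrom($defaultContext), domOptionsFrom($givenContext));
// $options is array('extension' => 'xml', 'directory' => '/tmp')
```

Because array_merge() lets later string keys win, "directory" takes the user's value while "extension" keeps its default.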

Next up: reading and writing the node value

Take a look at the repository containing the full code and let me know what you think. In my next post, we will discuss reading from and writing to a DOM stream (i.e. modifying node values).
