Parsing a consistent request on different environments

At work, I've recently been spending time on a router class for the custom framework that powers our internal administration system. This router will handle just about everything related to URL requests. It needs to:

  • Parse a request URI into a controller, page and arguments.
  • Receive a URL and redirect the user to that page.
  • Define absolute base directories for both server side includes and client side includes.
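
As a rough illustration of the first item, here is a minimal sketch of splitting a request path into a controller, a page and arguments. The function name parseRequest(), the 'index' defaults and the $folder parameter (the subfolder the application lives in, with leading and trailing slashes) are assumptions made for this example, not the framework's actual API.

```php
<?php
/**
 * Hypothetical helper: split a request URI into controller, page and arguments.
 * $folder is the subfolder the application lives in, e.g. '/' or '/ops/'.
 */
function parseRequest($requestUri, $folder = '/')
{
    // Keep only the path; the query string is not part of the route.
    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    // Remove the application subfolder from the front of the path, if present.
    if ($folder !== '/' && strpos($path, $folder) === 0) {
        $path = substr($path, strlen($folder));
    }

    // Split on slashes and discard empty segments.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));

    return array(
        'controller' => isset($segments[0]) ? $segments[0] : 'index', // assumed default
        'page'       => isset($segments[1]) ? $segments[1] : 'index', // assumed default
        'arguments'  => array_slice($segments, 2),
    );
}
```

With this sketch, both '/production/assignment/15' (folder '/') and '/ops/production/assignment/15' (folder '/ops/') would map to the controller "production", the page "assignment" and the single argument "15".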

While working on this router, I need to reliably figure out what the root folder is. Take these examples, all of which should resolve to the same page:

  • https://ops.domain.com/production/assignment/15
  • https://domain.com/ops/production/assignment/15
  • http://ops/production/assignment/15
  • http://workspace/ops/production/assignment/15

In each of these examples, the domain and subfolders are slightly different. To throw an extra kink into the mix, the production server is using a symlink to point the document root to a different folder. I need to parse those different URLs to extract these parameters:

  • Protocol and domain for sending out redirects as absolute URLs
  • Document root for server side includes
  • Subfolder for client side includes

Getting the protocol and domain is fairly trivial. Both are readily available in the $_SERVER superglobal:

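// HTTPS is set to a non-empty value when the request came in over HTTPS;
// some servers set it to 'off' for plain HTTP instead of leaving it unset.
$protocol = !empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off'
    ? 'https://'
    : 'http://';
// HTTP_HOST is taken from the request's Host header.
$domain = $_SERVER['HTTP_HOST'];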

The document root proved to be a bit trickier because of how the production server was set up. https://domain.com/ops/ used an alias, while https://ops.domain.com/ used a vhost. This led to inconsistent $_SERVER['DOCUMENT_ROOT'] values: https://domain.com/ops/ reported the document root as /var/www/domain.com/web/ when, for my purposes, it should have been /var/www/ops.domain.com/web/. The solution I found lay in the __FILE__ magic constant, which always reported /var/www/ops.domain.com/web/index.php on the production server:

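// __FILE__ is the full path of this file with symlinks resolved.
// On PHP 5.3 and later, __DIR__ is equivalent to dirname(__FILE__).
$document_root = dirname(__FILE__);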

That leaves the subfolder as the last parameter to extract. Luckily, $_SERVER['SCRIPT_NAME'] gives me the information I need with only a little parsing. I just need the directory, so dirname() does most of the job. The one caveat is that the trailing slash is not consistent:

echo dirname('/ops/index.php'); // prints /ops
echo dirname('/index.php');     // prints /

You can see that the result normally has only a leading slash, but in the case of "/", that slash doubles as a trailing slash. In order to blindly append it to the end of the domain, I'd like it to have both a leading slash and a trailing slash. A quick ternary operator will handle that:

$folder = dirname($_SERVER['SCRIPT_NAME']);
// Drop a lone '/' so that adding the trailing slash never produces '//'.
$folder = ($folder === '/' ? '' : $folder) . '/';
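
For comparison, the same normalisation can be sketched with rtrim() instead of a ternary. This is an alternative, not what the router itself uses; the function name baseFolder() is hypothetical, and the backslash handling is an assumption for servers where dirname() returns a backslash for the root.

```php
<?php
/**
 * Hypothetical helper: directory of the front controller, always with a
 * leading and a trailing slash.
 */
function baseFolder($scriptName)
{
    // Normalise backslashes, which dirname() can return on Windows.
    $folder = str_replace('\\', '/', dirname($scriptName));

    // rtrim() turns '/' into '' and leaves '/ops' alone; then add one slash back.
    return rtrim($folder, '/') . '/';
}

echo baseFolder('/ops/index.php'), "\n"; // prints /ops/
echo baseFolder('/index.php'), "\n";     // prints /
```

Wrapping the logic in a function that takes the script name as a parameter also makes it easy to check against each environment's SCRIPT_NAME without touching $_SERVER.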

This ensures that I can always blindly concatenate protocol, domain and folder to get a consistent, absolute HTML base for my site. Throw this into a base tag in the HTML head of the site and all other links will be relative to it.
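
A minimal sketch of that last step might look like the following. The function name baseTag() is hypothetical, and escaping with htmlspecialchars() is an added precaution (the host name originates from the request), not something the original setup necessarily did.

```php
<?php
/**
 * Hypothetical helper: build the <base> tag from protocol, domain and folder.
 */
function baseTag($protocol, $domain, $folder)
{
    // $folder is expected to start and end with a slash, e.g. '/' or '/ops/'.
    $href = $protocol . $domain . $folder;

    // Escape the value before placing it inside an HTML attribute.
    return '<base href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">';
}

echo baseTag('https://', 'domain.com', '/ops/');
// prints <base href="https://domain.com/ops/">
```

With the base tag in place, a relative link such as production/assignment/15 resolves against that href in every environment.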
