Chapter 9: Mason and CGI

Although mod_perl is pretty cool, it's not the only way to use Mason to build a web site. In fact, plenty of times it's more advisable to use CGI than mod_perl, as we describe in this chapter. If you find yourself in such a situation, you're in luck -- Mason works just fine under CGI, and special care has gone into making sure the cooperation is smooth. The HTML::Mason::CGIHandler module provides the glue necessary to use Mason in most common CGI environments.

CGI-Appropriate Situations

Before we get into the details of how to set up Mason under CGI, let's think about why you might want to use this setup. After all, isn't mod_perl supposed to be better than CGI? Well, yes and no. As in most things, context is everything. The following factors may conspire to make you choose clunky old CGI over clunky new mod_perl in a particular situation:

CGI-Inappropriate Situations

In some situations, CGI just won't do. Depending on who you ask, these situations might be characterized with terms ranging from "always" to "never." It's beyond the scope of this book to make all the arguments germane to the CGI versus mod_perl debate, but these factors might make choosing CGI impossible:

Creating a CGI-Based Site in Mason

You can get Mason and CGI to work together in several different ways. One model is to write traditional CGI scripts that use Mason as a templating language, executing Mason components from inside the CGI program. See "Using Mason Templates Inside Regular CGI Scripts" for how to set this up.

A better approach to building a Mason site under CGI is to let the components drive the site. You can configure your web server to invoke a CGI script of your choosing for certain requests, and that script can begin Mason processing on those files. In other words, you can have the same set of Mason components in your site you would have under mod_perl, but those components get executed under the CGI paradigm.

Your comrade in this endeavor is the HTML::Mason::CGIHandler module. Its role is similar to the HTML::Mason::ApacheHandler module, but since CGI is a bit clunkier than mod_perl and the CGIHandler is a bit younger than ApacheHandler, a bit more configuration is necessary. You'll need to combine four ingredients: directives in the server's configuration files (httpd.conf or .htaccess under Apache), a Mason wrapper CGI script, the Mason components themselves, and the HTML::Mason::CGIHandler module.

The necessary configuration directives are fairly straightforward. Here's an example for Apache:

  Action html-mason /cgi-bin/mason_handler.cgi
  <FilesMatch "\.html$">
   SetHandler html-mason
  </FilesMatch>

Here, the mason_handler.cgi script can be located wherever you want, provided it's set up by the server to be run as a CGI script. The /cgi-bin directory is already configured on most systems using the ScriptAlias directive, so that's a reasonable place to put the handler script, though it's certainly not the only place.

Instead of passing all .html files through Mason as in the previous example, you might configure the server to Masonize all files in a certain directory (use a <Directory> block for this or an .htaccess file in that directory), only certain specific files (use a <Files> block or a different <FilesMatch> pattern to select those files), or some more complicated scheme. See your server's documentation for more configuration help. Remember, each CGI request will take a highly nonzero time to execute, so don't process a file with Mason unless it's actually a Mason component. In particular, make sure you don't accidentally pass image files to Mason, because each web page typically contains many images, and the extra processing time for those images will be a big waste if you invoke Mason unnecessarily, not to mention that Mason may mangle those images when processing them.
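As a rough sketch of the directory-based variant, the following fragment hands only the .html files under one directory to Mason, so images and other static files in the same tree are served normally. The /var/www/html/dynamic path is a made-up example; the handler name and script location are carried over from the configuration shown earlier.

```apache
Action html-mason /cgi-bin/mason_handler.cgi

# Hypothetical directory; adjust to your own layout
<Directory "/var/www/html/dynamic">
  # Only .html files are handed to Mason; images and the like are untouched
  <FilesMatch "\.html$">
    SetHandler html-mason
  </FilesMatch>
</Directory>
```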

Next, you need to create your mason_handler.cgi script. It should be located wherever the Action directive indicates in the server configuration. Here's a mason_handler.cgi that will serve nicely for most sites. It's fairly simple, since most of the real work is done inside the HTML::Mason::CGIHandler module.

  #!/usr/bin/perl -w
  
  use strict;
  use HTML::Mason::CGIHandler;
  
  my $h = HTML::Mason::CGIHandler->new
    (
     data_dir  => "$ENV{DOCUMENT_ROOT}/../mason-data",
     allow_globals => [qw(%session $user)],
    );
  
  $h->handle_request;

The data_dir and allow_globals parameters should look familiar; they're just passed along to the Interpreter and Compiler, respectively. Note that the data_dir we use here may need to be changed for your setup. The main consideration is that your data_dir is somewhere outside the document root, so feel free to put it wherever makes sense for you.

Note that we didn't pass a comp_root parameter. If no comp_root is specified, HTML::Mason::CGIHandler will use $ENV{DOCUMENT_ROOT} as the component root.
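If you would rather not depend on the DOCUMENT_ROOT environment variable, you can name the component root yourself. The following is a minimal sketch of that variation on the handler script; both paths are hypothetical and should be replaced with your own.

```perl
#!/usr/bin/perl -w

use strict;
use HTML::Mason::CGIHandler;

my $h = HTML::Mason::CGIHandler->new
  (
   # Hypothetical paths: set comp_root explicitly instead of relying on
   # $ENV{DOCUMENT_ROOT}, and keep data_dir outside the document root
   comp_root => '/var/www/html',
   data_dir  => '/var/www/mason-data',
  );

$h->handle_request;
```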

With the server configuration and handler script in place, you're ready to use Mason. You can create a hierarchy of components for your site just as you would under a mod_perl setup.

Using Mason Templates Inside Regular CGI Scripts

We have argued several times against the traditional CGI model, in which the response to each web request is driven primarily by a Perl script (or other executable program[1]) that focuses on making all the logical decisions necessary for fulfilling that request. We tend to prefer template-based solutions driven by the content of the request, using concise sprinklings of programming to control the dynamic elements of the request. In other words, we prefer Mason components to CGI scripts.

However, the world is a strange place. For some odd reason, managers may not always be persuaded by the well-reasoned arguments their programmers make in favor of using Mason in its traditional way. They may even want to take an existing functional site based on badly written CGI scripts and use some basic Mason-based templating techniques to achieve the timeless goal of separating logic from presentation. In these situations, you may be called upon to use Mason as if it were one of the lightweight solutions mentioned in Chapter 1.

Luckily, you won't be the first person to want such a thing. This path has been trodden often enough that it's fairly easy to use Mason as a standalone templating language. To do this, you create a Mason Interpreter, then call the Interpreter's exec() method, passing it either a component path or component object as the first argument.
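To illustrate the component-path form of exec(), here is a small sketch that writes a one-line template into a temporary component root and runs it by path. The file name hello.mas and the $name argument are invented for the example, and the out_method parameter is used so that the output lands in a scalar rather than on STDOUT.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use HTML::Mason;

# Build a throwaway component root holding one template (hypothetical names)
my $comp_root = tempdir(CLEANUP => 1);
open my $fh, '>', "$comp_root/hello.mas" or die "Cannot write component: $!";
print {$fh} <<'EOF';
<%args>
$name => 'World'
</%args>
Hello, <% $name %>!
EOF
close $fh or die "Cannot close component: $!";

# Collect the output in a scalar instead of sending it to STDOUT
my $output = '';
my $interp = HTML::Mason::Interp->new(
    comp_root  => $comp_root,
    out_method => \$output,
);

# The first argument is a component path, resolved relative to comp_root
$interp->exec('/hello.mas', name => 'Mason');
print $output;
```

Passing a path this way keeps templates in their own files, which is usually easier to maintain than embedding them in the script.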

The CGI script in Example 9-1 is sort of the "Hello, World" of dynamic web programming. It lets the user enter text in an HTML form, submit the form, and see the resultant text in the server's response.

Example 9-1. Hello, World in Mason with traditional CGI
  #!/usr/bin/perl -w
  
  use strict;
  use CGI;
  use HTML::Mason;
  
  # Create a new query object, and print the standard header
  my $q = CGI->new;
  print $q->header;
  
  # Create a Mason Interpreter
  my $interp = HTML::Mason::Interp->new( );
  
  # Generate a Component object from the given text
  my $component = $interp->make_component(comp_source => <<'EOF');
  <%args>
   $user_input => '(no input)'
  </%args>
  
  <HTML>
  <HEAD><TITLE>You said '<% $user_input |h %>'</TITLE></HEAD>
  <BODY>
  You said '<% $user_input |h %>'.  Type some text below and submit the form.<BR>
  
  <FORM ACTION="" METHOD="GET">
  <INPUT NAME="user_input" value=""><br>
  <INPUT TYPE="submit" VALUE="Submit">
  </FORM>
  </BODY>
  </HTML>
  EOF
  
  my %vars = $q->Vars;
  # Trim surrounding whitespace, but only if the parameter was submitted
  $vars{user_input} =~ s/^\s+|\s+$//g if defined $vars{user_input};
  
  # Execute the component, with output going to STDOUT
  $interp->exec($component, %vars);

Notice a couple of things about the code. First, the Mason component is located in the middle of the code, surrounded by some fairly generic Perl code to fetch the query parameters and pass them to the component. Second, the Mason Interpreter is the main point of entry for most of the tasks performed. First we create an Interpreter, then we use the Interpreter's make_component() method to create a new Component object (see Chapter 5 for more on the make_component() method), then we call the Interpreter's exec() method to set the Mason wheels in motion.

Also, notice that the example code calls the CGI method Vars() to get at the query parameters. This is relatively convenient but doesn't properly handle multiple key/value pairs with the same key. To do this better, we'd either have to use the CGI param() method and parse out the multiple keys ourselves or split the Vars() values on ASCII \0 (thus disallowing \0 in our data). You're probably not jumping for joy at the prospect of dealing with these kinds of minutiae, but this is the kind of thing you'll find yourself dealing with in CGI environments.
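For the second option, a sketch along the following lines may help. It splits the packed values on \0 and turns repeated keys into array references. The helper name expand_packed_values() is hypothetical, and a literal hash stands in for the result of $q->Vars so that the snippet runs on its own.

```perl
use strict;
use warnings;

# Turn the "\0"-packed strings that Vars() produces for repeated keys into
# array references. expand_packed_values() is a hypothetical helper name.
sub expand_packed_values {
    my (%vars) = @_;
    my %args;
    for my $key (keys %vars) {
        my $value = $vars{$key};
        if (defined $value && index($value, "\0") >= 0) {
            # A limit of -1 keeps trailing empty values, as in "a\0"
            $args{$key} = [ split /\0/, $value, -1 ];
        }
        else {
            $args{$key} = $value;
        }
    }
    return %args;
}

# Stand-in for "my %vars = $q->Vars;" so the sketch runs without a request
my %vars = (color => "red\0blue", user_input => 'hello');
my %args = expand_packed_values(%vars);

# Show each key with its value or list of values
print "$_: ", (ref $args{$_} ? join(', ', @{ $args{$_} }) : $args{$_}), "\n"
    for sort keys %args;
```

The resulting hash could then be passed to exec() in place of %vars, with the caveat already mentioned: data that legitimately contains \0 would be split as well.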

If you don't actually need to examine or alter the query parameters yourself before invoking the Mason template, you can take advantage of the HTML::Mason::CGIHandler handle_comp() method, which will create a CGI object and parse out the query parameters, then invoke the component you pass it. Example 9-2 shows the previous example rewritten using the handle_comp() method.

Example 9-2. A lazier approach to Mason in CGI
  #!/usr/bin/perl -w
  
  use strict;
  use HTML::Mason::CGIHandler;
  
  # Create a new CGIHandler object
  my $h = HTML::Mason::CGIHandler->new( );
  
  # Generate a Component object from the given text
  my $component = $h->interp->make_component(comp_source => <<'EOF');
  <%args>
   $user_input => '(no input)'
  </%args>
  
  <HTML>
  <HEAD><TITLE>You said '<% $user_input |h %>'</TITLE></HEAD>
  <BODY>
  You said '<% $user_input |h %>'.  Type some text below and submit the form.<BR>
  
  <FORM ACTION="" METHOD="GET">
  <INPUT NAME="user_input" value=""><br>
  <INPUT TYPE="submit" VALUE="Submit">
  </FORM>
  </BODY>
  </HTML>
  EOF
  
  # Invoke the component, with output going to STDOUT
  $h->handle_comp($component);

As you can see, this hides all the CGI argument processing, ensuring that you don't make a silly mistake (or get lazy) in handling the query parameters. It also handles sending the HTTP headers. This approach is usually preferable to the one shown in Example 9-1. Of course, if you're letting Mason handle all the details of the request, you have to wonder why you don't just use the Action directive with a generic CGI wrapper, as covered in "Creating a CGI-Based Site in Mason".

Design Considerations

If you start building a site in this way, with each CGI script invoking Mason as a templating engine, you're going to face some design decisions. For instance, if your code needs to do some argument processing or other decision making that alters the output, should those decisions happen inside or outside the Mason template? If you do a bunch of important stuff outside the template that alters the behavior inside the template, you can create lots of nonobvious logical dependencies that can be a nightmare to maintain. It's somewhat better to put this stuff inside the template, but you run the risk of obscuring the template's real purpose, which is to generate HTML output.

To really make the right kinds of decisions, direct yourself to Chapter 10, in which we try to convince you to use Mason for what Mason is good for and Perl modules for what Perl modules are good for. These design issues don't have much to do with the CGI approach per se, but as you can see from our example script, the flow is already a little convoluted even in the simplest of cases. Anything you can do to keep things tidy may save you a lot of pain later.

Differences Between Mason Under CGI and mod_perl

The main functional difference between the environments provided by HTML::Mason::CGIHandler and HTML::Mason::ApacheHandler is that $r, the Apache request object, is much more limited in functionality under CGI. In fact, under CGI it's not a real Apache request object at all; it just emulates a few of the more useful methods. It can't emulate some methods because they make sense only in a mod_perl environment. For example, you won't be able to access the Apache subrequest mechanism through lookup_uri() or lookup_file(), you won't be able to get at the client connection through the connection() method, and you can't get configuration parameters via dir_config().

However, $r does have methods to help you set headers in the outgoing response, including Location and Content-Type headers. This makes it relatively straightforward to send client-side redirects and to use Mason to generate plain text, XML, image data, or other formats besides the default HTML.

To set outgoing headers, you can use the $r->header_out() and $r->content_type() methods in your components. They are very similar to their mod_perl counterparts of the same names. The header_out() method takes two arguments, the name of a header and the value it should be set to. If you pass only one argument, the header's value won't be set, but the method will return the current value of the header, as set by a previous call to header_out().

The content_type() method is the "official" way to set the content type of the outgoing response. It's essentially just an abbreviation for passing Content-Type as the first argument to the header_out() method. If you pass an argument to content_type(), you'll set the outgoing content type. If you don't set the content type during the request, the CGI module will set the content type to text/html.

Under normal circumstances, header_out() and content_type() just pass along any headers you set to the CGI module's header() method. If you previously set a header that you want to unset, you can pass undef as the new value to header_out() or content_type(). Instead of setting the header's value to undef (which wouldn't make a lot of sense in the HTTP context), the header will be unset (i.e., removed from the table of headers to send to the client).

Like its cousin ApacheHandler, CGIHandler adds an $m->redirect() method to the request object $m, so you can redirect browsers to a URL of your choosing in the same way you would under mod_perl.
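To make the header methods and $m->redirect() a little more concrete, here is a sketch of a component that sends visitors elsewhere unless a name was supplied, and otherwise answers in plain text. The $name argument, the /ask_name.html URL, and the X-Greeting header are all invented for illustration.

```mason
<%args>
$name => ''
</%args>
<%init>
# Hypothetical URL: redirect when no name was supplied
$m->redirect('/ask_name.html') unless length $name;

# Plain text instead of the default text/html
$r->content_type('text/plain');

# Set a made-up header, then read it back with the one-argument form
$r->header_out('X-Greeting' => 'hello');
my $greeting = $r->header_out('X-Greeting');
</%init>
<% $greeting %>, <% $name %>.
```

Nothing here uses mod_perl-only methods, so a component written this way should behave the same under either handler.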

Finally, if you want to access the CGI query object for the current request, you may do so by calling the $m->cgi_object method. In general it's best to avoid using the query object directly, because doing so will lead to nonportable code and you most likely won't be taking advantage of Mason's argument-processing and content-generation techniques. However, as with most things Perl, you can always get enough rope, even if it means you might end up in a hopelessly tangled mess, dangling by an ankle from the gallows pole of your own code.

See the documentation for HTML::Mason::CGIHandler for more details.

Footnotes

1. But who are we kidding, eh? Are you going to be writing these things in COBOL?