Now that you're familiar with Mason's basic syntax and some of its more advanced features, it's time to explore the details of how the various pieces of the Mason architecture work together to process components. By knowing the framework well, you can use its pieces to your advantage, processing components in ways that match your intentions.
In this chapter we'll discuss four of the persistent objects in the
Mason framework: the Interpreter, Resolver, Lexer, and Compiler. These objects
are created once (in a mod_perl
setting, they're typically created when the server is starting up) and
then serve many Mason requests, each of which may involve processing many Mason
components.
Each of these four objects has a distinct purpose. The Resolver is responsible for all interaction with the underlying component source storage mechanism, which is typically a set of directories on a filesystem. The main job of the Resolver is to accept a component path as input and return various properties of the component such as its source, time of last modification, unique identifier, and so on.
The Lexer is responsible for actually processing the component source code and finding the Mason directives within it. It interacts quite closely with the Compiler, which takes the Lexer's output and generates a Mason component object suitable for interpretation at runtime.
The Interpreter ties the other three objects together. It is responsible for taking a component path and arguments and generating the resultant output. This involves getting the component from the Resolver, compiling it, then caching the compiled version so that next time the Interpreter encounters the same component it can skip the resolving and compiling phases.
Figure 6-1 illustrates the relationship between these four objects. The Interpreter has a Compiler and a Resolver, and the Compiler has a Lexer.
Figure 6-1. The Interpreter and its cronies
An interesting feature of the Mason code is that, if a particular object contains another object, the containing object will accept constructor parameters intended for the contained object. For example, the Interpreter object will accept parameters intended for the Compiler or Resolver and do the right thing with them. This means that you often don't need to know exactly where a parameter goes. You just pass it to the object at the top of the chain.
Even better, if you decide to create your own Resolver for use with Mason, the Interpreter will take any parameters that your Resolver accepts -- not the parameters defined by Mason's default Resolver class.
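As a rough illustration of the idea (and not of Mason's actual mechanism, which is considerably more general), here is a toy sketch of a containing class that peels off the parameters its contained class declares and forwards them. The Toy::Resolver and Toy::Interp packages and the valid_params() method are hypothetical names invented for this example.

```perl
use strict;
use warnings;

# Hypothetical contained class: declares which constructor parameters it accepts.
package Toy::Resolver;
sub valid_params { return qw(comp_root) }
sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

# Hypothetical containing class: keeps its own parameters and forwards the rest.
package Toy::Interp;
sub new {
    my ($class, %args) = @_;
    my $resolver_class = delete($args{resolver_class}) || 'Toy::Resolver';

    # Ask the contained class (whichever one was chosen) what belongs to it.
    my %resolver_args;
    for my $name ($resolver_class->valid_params) {
        $resolver_args{$name} = delete $args{$name} if exists $args{$name};
    }

    my $self = bless { %args }, $class;
    $self->{resolver} = $resolver_class->new(%resolver_args);
    return $self;
}

package main;

my $interp = Toy::Interp->new(
    data_dir  => '/tmp/mason-data',   # stays with the container
    comp_root => '/tmp/comps',        # forwarded to the contained resolver
);
print "resolver comp_root: $interp->{resolver}{comp_root}\n";
```

Because the container asks the chosen class what it accepts, swapping in a different resolver_class changes the set of forwarded parameters without any change to the container.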
Also, if an object creates multiple delayed instances of another class, as the
Interpreter does with Request objects, it will accept the created class's
parameters in the same way, passing them to the created class at the
appropriate time. So if you pass the autoflush
parameter to the Interpreter's constructor, it will store this value and
pass it to any Request objects it creates later.
This system was motivated in part by the fact that many users want to be able to configure Mason from an Apache config file. Under this system, the user just sets a certain configuration directive (such as MasonAutoflush 1 to set the autoflush parameter) in her httpd.conf file, and it gets directed automatically to the Request objects when they are created.
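Every initialization parameter has an Apache configuration name formed by switching from lower_case_with_underscores to StudlyCaps and prepending "Mason". A small sketch of that mapping follows; the apache_directive_name() function is our own invention for illustration and is not part of Mason.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a parameter name such as 'allow_globals'
# into the matching Apache directive name.
sub apache_directive_name {
    my ($param) = @_;
    # Capitalize each underscore-separated word, then join the words.
    my $studly = join '', map { ucfirst(lc($_)) } split /_/, $param;
    return "Mason$studly";
}

# Prints MasonAutoflush, MasonAllowGlobals, and MasonInPackage.
print apache_directive_name($_), "\n" for qw(autoflush allow_globals in_package);
```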
The details of how this system works are fairly magical and the code involved
is so funky its creators don't know whether to rejoice or weep, but it
works, and you can take advantage of this if you ever need to create your own
custom Mason classes. Chapter 12 covers this in its discussion of the Class::Container
class, where all the funkiness is located.
Mason's built-in Lexer class is, appropriately enough, HTML::Mason::Lexer. All it does is parse the text of Mason components and pass off the sections it finds to the Compiler. As of Version 1.10, the Lexer doesn't actually accept any parameters that alter its behavior, so there's not much for us to say in this section.
Future versions of Mason may include other Lexer classes to handle alternate source formats. Some people -- crazy people, we assure you -- have expressed a desire to write Mason components in XML, and it would be fairly simple to plug in a new Lexer class to handle this. If you're one of these crazy people, you may be interested in Chapter 12 to see how to use objects of your own design as pieces of the Mason framework.
By the way, you may be wondering why the Lexer isn't called a Parser, since its main job seems to be to parse the source of a component. The answer is that previous implementations of Mason had a Parser class with a different interface and role, and a different name was necessary to maintain forward (though not backward) compatibility.
By default, Mason will use the
HTML::Mason::Compiler::ToObject
class to do its compilation. It is a subclass of the generic
HTML::Mason::Compiler
class, so we describe here all parameters that the ToObject
variety will accept, including parameters inherited from its parent:
You may want to allow access to certain Perl variables across all components
without declaring or initializing them each time. For instance, you might want
to let all components share access to a $dbh
variable that contains a DBI
database handle, or you might want to allow access to an Apache::Session
%session
variable.
For cases like these, you can set the allow_globals
parameter to an array reference containing the names of any global variables
you want to declare. Think of it like a broadly scoped use vars
declaration; in fact, that's exactly the way it's implemented under
the hood. If you wanted to allow the $dbh
and %session
variables, you would pass an allow_globals
parameter like the following:
allow_globals => ['$dbh', '%session']
Or in an Apache configuration file:
PerlSetVar MasonAllowGlobals $dbh
PerlAddVar MasonAllowGlobals %session
The allow_globals
parameter can be used effectively with the Perl local()
function in an autohandler. The top-level autohandler is a convenient place to
initialize global variables, and local()
is exactly the right tool to ensure that they're properly cleaned up at
the end of the request:
# In the top-level autohandler:
<%init>
 # $dbh and %session have been declared using 'allow_globals'
 local $dbh = DBI->connect(...connection parameters...);
 local *session; # Localize the glob so the tie() expires properly
 tie %session, 'Apache::Session::MySQL',
     Apache::Cookie->fetch->{session_id}->value,
     { Handle => $dbh, LockHandle => $dbh };
</%init>
Remember, don't go too crazy with globals: too many of them in the same process space can get very difficult to manage, and in an environment like Mason's, especially under mod_perl, the process space can be very large and long-lasting. But a few well-placed and well-scoped globals can make life nice.
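To see why local() suits this job, here is a standalone sketch that runs outside of Mason: a package global receives a temporary value for the duration of one call and reverts automatically afterward. The handle_request() and current_dbh() subroutines, and the string standing in for a database handle, are placeholders made up for this example.

```perl
use strict;
use warnings;

our $dbh;    # package global, analogous to one declared with allow_globals

# Stand-in for a component that reads the global.
sub current_dbh { return defined $dbh ? $dbh : 'undef' }

# Stand-in for the top-level autohandler wrapping one request.
sub handle_request {
    local $dbh = 'handle-for-this-request';   # placeholder, not a real DBI handle
    return current_dbh();                     # callees see the localized value
}

my $during = handle_request();
my $after  = current_dbh();   # the localized value is gone once the call returns
print "during request: $during\n";
print "after request:  $after\n";
```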
The default_escape_flags parameter allows you to set a global default for the escape flags in <% $substitution %> tags. For instance, if you set default_escape_flags to 'h', then all substitution tags in your components will pass through HTML escaping. If you decide that an individual substitution tag should not obey the default_escape_flags parameter, you can use the special escape flag 'n' to ignore the default setting and add whatever additional flags you might want to employ for that particular substitution tag.
in compiler settings:
    default_escape_flags => 'h',

in a component:
    You have <% $amount %> clams in your aquarium.
    This is <% $difference |n %> more than your rival has.
    <a href="emotion.html?emotion=<% $emotion |nu %>">Visit your <% $emotion %> place!</a>

acts as if you had written:
    You have <% $amount |h %> clams in your aquarium.
    This is <% $difference %> more than your rival has.
    <a href="emotion.html?emotion=<% $emotion |u %>">Visit your <% $emotion |h %> place!</a>
By default, all components will be run under Perl's strict
pragma, which forces you to declare any Perl variables you use in your
component. This is a very good feature, as the strict
pragma can help you avoid all kinds of programming slip-ups that may lead to
mysterious and intermittent errors. If, for some sick reason, you want to turn
off the strict
pragma for all your components, you can set the use_strict
parameter to a false value and watch all hell get unleashed as you shoot your
Mason application in the foot.
A far better solution is to just insert no strict;
into your code whenever you use a construct that's not allowed under the strict
pragma; this way your casual usage will be allowed in only the smallest
enclosing block (in the worst case, one entire component). Even better would be
to find a way to achieve your goals while obeying the rules of the strict
pragma, because the rules generally enforce good programming practice.
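As a sketch of keeping the relaxation as small as possible, the following standalone Perl disables only the 'refs' stricture (a narrower choice than a bare no strict;) and only inside the one subroutine that needs it. The greet() and call_by_name() subroutines are hypothetical; in a component, the same statement would sit inside whatever Perl block needs it.

```perl
use strict;
use warnings;

sub greet { return "hello, $_[0]" }

sub call_by_name {
    my ($name, @args) = @_;
    # Relax only the 'refs' stricture, and only inside this block.
    no strict 'refs';
    return &{"main::$name"}(@args);   # symbolic reference to a subroutine
}

my $result = call_by_name('greet', 'world');
print "$result\n";
```

Everything outside call_by_name() still runs under the full strict pragma.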
The code written in <%perl> sections (or other component sections that contain Perl code) must be compiled in the context of some package, and the default package is HTML::Mason::Commands (see footnote 2). To specify a different package, set the in_package
compiler parameter. Under normal circumstances you shouldn't concern
yourself with this package name (almost everything in Mason is done with
lexically scoped my
variables), but for historical reasons you're allowed to change it to
whatever package you want.
Related settings are the Compiler's allow_globals parameter/method and the Interpreter's set_global() method. These let you declare and assign to variables in the package you specified with in_package, without actually needing to specify that package again by name.
You may also want to control the package name in order to import symbols (subroutines, constants, etc.) for use in components. Although the importing of subroutines seems to be gradually going out of style as people adopt more strict object-oriented programming practices, importing constants is still quite popular, and especially useful in a web context, where various numerical values are used as HTTP status codes. The following example, meant for use in an Apache server configuration file, exports all the common Apache constants so they can be used inside the site's Mason components.
PerlSetVar MasonInPackage My::Application
<Perl>
 {
   package My::Application;
   use Apache::Constants qw(:common);
 }
</Perl>
By default, components created by the compiler will be created by calling the HTML::Mason::Component
class's new()
method. If you want the components to be objects of a different class, perhaps
one of your own creation, you may specify a different class name in the comp_class
parameter.
As of Release 1.10 you can redesign Mason on the fly by subclassing one or more of Mason's core classes and extending (or reducing, if that's your game) its functionality. In an informal sense, we speak of Release 1.10 as having made Mason more "pluggable."
By default, Mason creates a Lexer object in the HTML::Mason::Lexer
class. By passing a lexer
parameter to the Compiler, you can specify a different Lexer object with
different behavior. For instance, if you like everything about Mason except for
the syntax it uses for its component files, you could create a Lexer object
that lets you write your components in a format that works well with your
favorite WYSIWYG
HTML editor, in a Python-esque whitespace soup, or however you like.
The lexer
parameter should contain an object that inherits from the HTML::Mason::Lexer
class. As an alternative to creating the object yourself and passing it to the
Compiler, you may instead specify a lexer_class
parameter, and the Compiler will create a new Lexer object for you by calling
the specified package's new()
method. This alternative is often preferable when it's inconvenient to
create new Perl objects, such as when you're configuring Mason from a web
server's configuration file. In this case, you should also pass any
parameters that are needed for your Lexer's new()
method, and they will find their way there.
Several access points let you step into the compilation process and alter the text of each component as it gets processed. The preprocess, postprocess_perl, postprocess_text, preamble, and postamble parameters let you exert a bit of ad hoc control over Mason's processing of your components.
Figure 6-2 illustrates the role of each of these five parameters.
Figure 6-2. Component processing hooks
With the preprocess
parameter, you may specify a reference to a subroutine through which all
components should be preprocessed before the compiler gets hold of them. The
compiler will pass your subroutine the entire text of the component in a scalar
reference. Your subroutine should modify the text in that reference directly --
any return value will be ignored.
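A minimal sketch of such a subroutine, exercised here on its own rather than through Mason: it strips HTML comments from the component source by editing the referenced text in place. The $strip_comments name and the sample source are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical preprocess hook: remove HTML comments from the component
# source before compilation. The hook receives a reference to the full text.
my $strip_comments = sub {
    my ($source_ref) = @_;
    $$source_ref =~ s/<!--.*?-->//gs;   # edit the text in place
    return;                             # any return value is ignored
};

# Standalone demonstration, calling the hook the way the text describes.
my $source = "Hello<!-- internal note -->, <% \$name %>!\n";
$strip_comments->(\$source);
print $source;
```

You would hand the subroutine to Mason as preprocess => $strip_comments when constructing the Compiler or the Interpreter.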
The sections of a Mason component can be coarsely divided into three categories: Perl sections (%-lines, <%init>
blocks, and so on), sections for special Mason directives (<%args>
blocks, <%flags>
blocks, and so on), and plain text sections (anything outside the other two
types of sections). The Perl and text sections can become part of the
component's final output, whereas the Mason directives control how the
output is created.
Similar to the preprocess parameter, the postprocess_perl and postprocess_text parameters let you step in and change a component's source before it is compiled. However, with these parameters you're stepping into the action one step later, after the component source has been divided into the three types of sections just mentioned. Accordingly, the postprocess_perl parameter lets you process Perl sections, and the postprocess_text parameter lets you process text sections. There is no corresponding hook for postprocessing the special Mason sections.
As with the preprocess parameter, the postprocess parameters should specify a subroutine reference. Mason will pass the
component source sections one at a time (again, as a scalar reference) to the
subroutine you specify, and your subroutine should modify the text in-place.
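Here is a sketch of a subroutine suitable for postprocess_text, again run standalone: it squeezes runs of spaces and tabs in each text section down to a single space. The $squeeze_whitespace name and the sample sections are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical postprocess_text hook: collapse runs of spaces and tabs
# in a plain-text section. Each section arrives as a scalar reference.
my $squeeze_whitespace = sub {
    my ($text_ref) = @_;
    $$text_ref =~ s/[ \t]+/ /g;   # modify the section in place
    return;
};

# Simulate the hook being called once per text section.
my @text_sections = ("<p>   Hello,\t\tworld   </p>\n", "  <hr>  \n");
$squeeze_whitespace->(\$_) for @text_sections;
print @text_sections;
```

Since Perl sections never reach this hook, the whitespace inside your code is left alone.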
If you specify a string value for the
preamble
parameter, the text you provide will be prepended to every component that gets
processed with this compiler. The string should contain Perl code, not Mason
code, as it gets inserted verbatim into the component object after compilation.
The default preamble
is the empty string.
The postamble
parameter is just like the preamble
parameter, except that the string you specify will get appended to the
component rather than prepended. Like the preamble, the default postamble is the empty string.
One use for preamble
and postamble
might be an
execution trace, in which you log the start and end events of each component.
One potential gotcha: if you have an explicit return
statement in a component, no further code in that component will run,
including code in its postamble. Thus it's not necessarily a good place to
run
cleanup code, unless you're positive you're never going to use return
statements. Cleanup code is usually better placed in an autohandler or similar
location. An alternate trick is to create objects in your preamble code and
rely on their DESTROY
methods to tell you when they're going out of scope.
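The DESTROY trick can be sketched in plain Perl as follows. The Toy::ScopeTrace class and the component_body() subroutine are stand-ins invented for this example; a real preamble would contain only the line that creates the object.

```perl
use strict;
use warnings;

our @TRACE;   # collected trace messages (a real preamble might log instead)

package Toy::ScopeTrace;
sub new {
    my ($class, $name) = @_;
    push @main::TRACE, "enter $name";
    return bless { name => $name }, $class;
}
sub DESTROY {
    my ($self) = @_;
    push @main::TRACE, "leave $self->{name}";
}

package main;

# Stand-in for a compiled component body that returns early.
sub component_body {
    my $trace = Toy::ScopeTrace->new('/some/component');  # what a preamble could do
    return 'early exit' if 1;     # an explicit return skips any trailing code ...
    push @TRACE, 'postamble';     # ... so this line never runs
}

component_body();
print "$_\n" for @TRACE;
```

The "leave" message is recorded even though the code standing in for the postamble is skipped, because the object is destroyed as soon as the subroutine's scope ends.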
Once an HTML::Mason::Compiler::ToObject
object is created, the following methods may be invoked. Many of them simply
return the value of a parameter that was passed (or set by default) when the
Compiler was created. Some methods may be used by developers when building a
site, while other methods should be called only by the various other pieces in
the Mason framework. Though you may need to know how the latter methods work if
you start plugging your own modules into the framework, you'll need to read
the Mason documentation to find out more about those
methods, as we don't discuss them here.
The compiler methods are comp_class(), in_package(), preamble(), postamble(), use_strict(), allow_globals(), default_escape_flags(), preprocess(), postprocess_perl(), postprocess_text(), and lexer().
Each of these methods returns the given property of the Compiler, which was typically set when the Compiler was created. If you pass an argument to these methods, you may also change the given property. One typically doesn't need to change any of the Compiler's properties after creation, but interesting effects could be achieved by doing so:
% my $save_pkg = $m->interp->compiler->in_package;
% $m->interp->compiler->in_package('MyApp::OtherPackage');
<& /some/other/component &>
% $m->interp->compiler->in_package($save_pkg);
The preceding example will compile the component /some/other/component -- and any components it calls -- in the package MyApp::OtherPackage rather than the default HTML::Mason::Commands package or whatever other package you specified using in_package.
Of course, this technique will work only if /some/other/component actually needs to be compiled at this point in the code; it may already be
compiled and cached in memory or on disk, in which case changing the in_package
property (or any other Compiler property) will have no effect. Because of
this, changing Compiler properties after the Compiler is created is neither a great idea nor
officially supported, but if you know what you're doing, you can use it for
whatever diabolical purposes you have in mind.
The default Resolver, HTML::Mason::Resolver::File, finds components and their meta-information (for example, modification date
and file length) on disk. The Resolver is a pretty simple thing, but it's
useful to give it its own place in the pluggable Mason framework because it
allows a developer to use whatever storage mechanism she wants for her
components.
The HTML::Mason::Resolver::File
class accepts only one parameter:
The
comp_root
parameter is Mason's component root. It specifies where components may be
found on disk. It is roughly analogous to Perl's @INC
array or the shell's $PATH
variable. You may specify comp_root
as a string containing the directory in which to search for components or as
an array reference of array references like so:
my $comp_root = [
    [web    => '/usr/local/httpd/documents'],
    [shared => '/usr/local/mason/comps'],
    [custom => '/home/ken/my_components'],
];
my $resolver = HTML::Mason::Resolver::File->new(comp_root => $comp_root);
Every time the Resolver is asked to find a component on disk, it will search these three directories in the given order, as discussed in Chapter 5.
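The ordered search can be pictured with the following sketch. It is not the Resolver's real code: the find_component() helper is hypothetical, and temporary directories stand in for real component roots.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: return [root_name, filesystem_path] for the first
# root that contains the component, or nothing if no root has it.
sub find_component {
    my ($comp_root, $comp_path) = @_;
    (my $relative = $comp_path) =~ s{^/}{};   # component paths start with '/'
    for my $root (@$comp_root) {
        my ($name, $dir) = @$root;
        my $file = File::Spec->catfile($dir, split m{/}, $relative);
        return [$name, $file] if -f $file;
    }
    return;
}

# Two throwaway directories: footer.mas exists in both, header.mas in one.
my $web    = tempdir(CLEANUP => 1);
my $shared = tempdir(CLEANUP => 1);
for my $file ([$web, 'footer.mas'], [$shared, 'footer.mas'], [$shared, 'header.mas']) {
    open my $fh, '>', File::Spec->catfile(@$file) or die "cannot create file: $!";
    close $fh;
}

my $comp_root = [ [web => $web], [shared => $shared] ];
my $footer = find_component($comp_root, '/footer.mas');
my $header = find_component($comp_root, '/header.mas');
print "footer.mas found in root: $footer->[0]\n";   # first matching root wins
print "header.mas found in root: $header->[0]\n";
```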
After a Resolver has been created, you may call its comp_root()
method, which returns the value of the comp_root
parameter as it was set at creation time.
If you don't provide a comp_root parameter, it defaults to something reasonably sensible. In a web context it defaults to the server's DocumentRoot; otherwise, it defaults to the current working directory.
The
Interpreter is the center of Mason's universe. It is responsible for
coordinating the activities of the Compiler and Resolver, as well as creating
Request objects. Its main task involves receiving requests for components and
generating the resultant output of those requests. It is also responsible for
several tasks behind the scenes, such as caching components in memory or on
disk. It exposes only a small part of its object API for public use; its
primary interface is via its constructor, the new()
method.
The new()
method accepts lots of parameters. It accepts any parameter that its Resolver
or Compiler (and through the Compiler, the Lexer) classes accept in their new()
methods; these parameters will be transparently passed along to the correct
constructor. It also accepts the following parameters of its own:
The autohandler_name parameter specifies the name that Mason uses for autohandler files. The default name is "autohandler".
The code_cache_max_size parameter sets the limit, in bytes, of the in-memory cache for component code.
The default is 10 megabytes (10 * 1024 * 1024). This is not the same thing as
the on-disk cache for component code, which will keep growing without bound
until all components are cached on disk. It is also different from the data
caches, the sizes of which you control through the $m->cache
and $m->cache_self
methods.
The data_dir parameter specifies the directory under which Mason stores its various data, such as compiled components, cached data, and so on. This cannot be changed after the Interpreter is created.
Normally, warnings issued during the loading of a component are treated as fatal errors by Mason. Mason will ignore warnings that match the regular expression specified in the ignore_warnings_expr parameter. The default setting is qr/Subroutine .* redefined/i. If you change this parameter, you will probably want to make sure that this
particular warning continues to be ignored, as this allows you to declare named
subroutines in the <%once>
section of components and not cause an error when the component is reloaded
and the subroutine is redefined.
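One way to extend the pattern while keeping the default behavior is to add an alternative to the regular expression. In the sketch below, the second alternative ("Frobnication skipped") and the sample warning strings are purely illustrative, and the grep only imitates the filtering that Mason performs internally.

```perl
use strict;
use warnings;

# The default pattern, extended to also ignore a second, made-up warning.
my $ignore = qr/Subroutine .* redefined|Frobnication skipped/i;

my @warnings = (
    'Subroutine helper redefined at /comps/index.html line 12.',
    'Use of uninitialized value $x in concatenation (.) or string at ...',
);

# Warnings that do not match the pattern would still be treated as fatal.
my @fatal = grep { $_ !~ $ignore } @warnings;
print "would be fatal: $_\n" for @fatal;
```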
The preloads parameter takes a list of components to be preloaded when the Interpreter is
created. In a mod_perl
setting this can lead to substantial memory savings and better performance,
since the components will be compiled in the server's parent process and
initially shared among the server children. It also reduces the amount of
processing needed during individual requests, as preloaded components will be
standing at the ready.
The list of components can either be specified by listing each component path
individually or by using glob()
-style patterns to specify several component paths.
Passing a true value for the static_source parameter causes Mason to execute in "static source" mode, which means that it will compile a source file only once, ignoring subsequent changes. In addition, it will resolve a given path only once, so adding or removing components will not be noticed by the Interpreter.
If you do want to make changes to components when Mason is in this mode, you will need to delete all of Mason's object files and, if you are running Mason under mod_perl, restart the Apache server.
This mode is useful in order to gain a small performance boost on a heavily trafficked site when your components don't change very often. If you don't need the performance boost, then don't bother turning this mode on, as it just makes for extra administrative work when you change components.
As we mentioned before, each Interpreter object creates a Compiler and a
Resolver object that it works with to serve requests. You can substantially
alter the compilation or resolution tasks by providing your own Compiler or
Resolver when creating the Interpreter, passing them as the values for the compiler
or resolver
parameters. Alternatively, you may pass compiler_class
or resolver_class
parameters (and any arguments required by those classes' new()
methods) and allow the Interpreter to construct the Compiler or Resolver from
the other parameters you specify:
my $interp = HTML::Mason::Interp->new(
    resolver_class       => 'MyApp::Resolver',
    compiler_class       => 'MyApp::Compiler',
    comp_root            => '/home/httpd/docs',  # Goes to resolver
    default_escape_flags => 'h',                 # Goes to compiler
);
By default, the Compiler will be an HTML::Mason::Compiler::ToObject
object, and the Resolver will be an HTML::Mason::Resolver::File
object.
Besides the Interpreter's own parameters, you can pass the Interpreter any parameter that the Request object accepts. These parameters will be saved internally and used as defaults when making a new Request object.
The parameters that can be set are: autoflush, data_cache_defaults, dhandler, error_mode, error_format, and out_method.
Besides accepting these as constructor parameters, the Interpreter also provides get/set accessors for these attributes. Setting these attributes in the Interpreter will change the attribute for all future Requests, though it will not change the current Request.
1. All initialization parameters have corresponding Apache configuration names, found by switching from lower_case_with_underscores to StudlyCaps and prepending "Mason".
2. This package name is purely historical; it may be changed in the future.