Compiling mod_perl from source requires a few tools. You will need an ANSI C compiler (such as gcc) and the make utility. All standard distributions include these. If they are not installed, you can add them with your distribution's package manager (rpm, apt, yast, etc.).

You will also need Perl (5.004 or higher). To check that a tool is available, run it with the -v option:

panic% make -v
 panic% gcc -v
 panic% perl -v

If you get a "Command not found" message, the tool needs to be installed.
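The same check can be scripted. The following is a minimal sketch, assuming a POSIX shell; the function name have_tool is our own invention for this illustration, and it relies only on the shell's command -v lookup:

```shell
# have_tool NAME: succeed if NAME is found in the current PATH.
# (have_tool is a hypothetical helper, not part of any package.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the required build tools are present.
for tool in make gcc perl; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: missing, install it before building"
    fi
done
```

Unlike running each tool with -v, this only tells you whether the tool exists; it does not report the version.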

The source distributions of Apache and mod_perl can be obtained from http://www.apache.org/ and http://perl.apache.org/, respectively.

The two packages are named apache_1.3.xx.tar.gz and mod_perl-1.xx.tar.gz, where 1.3.xx and 1.xx should be replaced with the real version numbers of Apache and mod_perl, respectively. Although 2.0 development versions of Apache and mod_perl are available, this book covers the mod_perl 1.0 and Apache 1.3 generation, which were the stable versions when this book was written. See Chapter 24 and Chapter 25 for more information on the Apache 2.0 and mod_perl 2.0 generation.

Move the downloaded packages into a directory of your choice (for example, /home/stas/src/), proceed with the following steps, and mod_perl will be installed:

panic% cd /home/stas/src
 panic% tar -zvxf apache_1.3.xx.tar.gz
 panic% tar -zvxf mod_perl-1.xx.tar.gz
 panic% cd mod_perl-1.xx
 panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
     APACHE_PREFIX=/home/httpd DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
 panic% make && make test
 panic% su
 panic# make install

All that remains is to add a few configuration lines to the Apache configuration file (/usr/local/apache/conf/httpd.conf), start the server, and enjoy mod_perl.

2.2 Installing mod_perl on Unix Platforms

Now let's go over the installation again, this time with each step explained in detail and with some troubleshooting advice. If the build worked and you are in a hurry to boot your new httpd, you may skip to Section 2.3.

Before installing Apache and mod_perl, you usually have to become root so that the files can be installed in a protected area. However, users without root access can still install all files under their home directories by building Apache in an unprivileged location; you need root access only to install it. We will talk about the nuances of this approach in Chapter 3.

2.2.1 Obtaining and Unpacking the Source Code

The first step is to obtain the source code distributions of Apache and mod_perl. These distributions can be retrieved from http://www.apache.org/dist/httpd/ and http://perl.apache.org/dist/ and are also available from mirror sites. Even if you have the Apache server running on your machine, you'll need its source distribution to rebuild it from scratch with mod_perl.

The source distributions of Apache and mod_perl should be downloaded into a directory of your choice. For the sake of consistency, we assume throughout the book that all builds are being done in the /home/stas/src directory. Just remember to substitute /home/stas/src in the examples with the actual path being used.

The next step is to move to the directory containing the source archives:

panic% cd /home/stas/src

Uncompress and untar both sources. GNU tar allows this using a single command per file:

panic% tar -zvxf apache_1.3.xx.tar.gz
 panic% tar -zvxf mod_perl-1.xx.tar.gz

For non-GNU tars, you may need to do this with two steps (which you can combine via a pipe):

panic% gzip -dc apache_1.3.xx.tar.gz | tar -xvf -
 panic% gzip -dc mod_perl-1.xx.tar.gz | tar -xvf -

Linux distributions supply tar and gzip and install them by default. If your machine doesn't have these utilities already installed, you can get tar and gzip from http://www.gnu.org/, among other sources. The GNU versions are available for every platform that Apache supports.
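If you unpack archives on several kinds of machines, the choice between the two forms can be wrapped in a small function. This is only a sketch: unpack_tgz is a name we made up, and the test for GNU tar (looking for the words "GNU tar" in the output of tar --version) is an assumption about how your tar identifies itself. When the test fails, the function simply falls back to the gzip pipe, which works everywhere:

```shell
# unpack_tgz ARCHIVE: extract a gzip-compressed tar archive into the
# current directory. (unpack_tgz is a hypothetical helper.)
unpack_tgz() {
    if tar --version 2>/dev/null | grep -q 'GNU tar'; then
        # GNU tar can decompress and extract in one step.
        tar -zxf "$1"
    else
        # Portable two-step form, combined via a pipe.
        gzip -dc "$1" | tar -xf -
    fi
}

# Usage (with the real file name):
#   unpack_tgz apache_1.3.xx.tar.gz
```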

2.2.2 Building mod_perl

Move into the /home/stas/src/mod_perl-1.xx/ source distribution directory:

panic% cd mod_perl-1.xx

The next step is to create the Makefile. This is no different in principle from the creation of the Makefile for any other Perl module.

panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
   DO_HTTPD=1 USE_APACI=1 EVERYTHING=1

mod_perl accepts a variety of parameters. The options specified above will enable almost every feature that mod_perl offers. There are many other options for fine-tuning mod_perl to suit particular circumstances; these are explained in detail in Chapter 3.

Running Makefile.PL will cause Perl to check for prerequisites and identify any required software packages that are missing. If it reports missing Perl packages, they will have to be installed before proceeding. Perl modules are available from CPAN (http://cpan.org/) and can easily be downloaded and installed.

An advantage of installing mod_perl with the help of the CPAN.pm module is that all the missing modules will be installed with the Bundle::Apache bundle:

panic% perl -MCPAN -e 'install("Bundle::Apache")'

We will talk in depth about using CPAN.pm in Chapter 3.

Running Makefile.PL also transparently executes the ./configure script from Apache's source distribution directory, which prepares the Apache build configuration files. If parameters must be passed to Apache's ./configure script, they can be passed as options to Makefile.PL. Chapter 3 covers all this in detail.

The httpd executable can now be built by using the make utility (note that the current working directory is still /home/stas/src/mod_perl-1.xx/):

panic% make

This command prepares the mod_perl extension files, installs them in the Apache source tree, and builds the httpd executable (the web server itself) by compiling all the required files. Upon completion of the make process, the working directory is restored to /home/stas/src/mod_perl-1.xx/.

Running make test will execute various mod_perl tests on the newly built httpd executable:

panic% make test

This command starts the server on a nonstandard port (8529) and tests whether all parts of the built server function correctly. The process will report anything that does not work properly.

2.2.3 Installing mod_perl

Running make install completes the installation process by installing all the Perl files required for mod_perl to run. It also installs the mod_perl documentation (manpages). Typically, you need to be root to have permission to do this, but another user account can be used if the appropriate options are set on the perl Makefile.PL command line (see Chapter 3). To become root, use the su command.

panic% su
 panic# make install

If you have the proper permissions, you can also chain all three make commands into a single command line:

panic# make && make test && make install

The single-line version simplifies the installation, since there is no need to wait for each command to complete before starting the next one. Of course, if you need to become root in order to run make install, you'll either need to run make install as a separate command or become root before running the single-line version.

If you choose the all-in-one approach and any of the make commands fail, execution will stop at that point. For example, if make alone fails, then make test and make install will not be attempted. Similarly, if make test fails, then make install will not be attempted.
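The short-circuit behavior of && is easy to see with stand-ins for the three make commands. In this sketch the step_* functions are hypothetical placeholders that only record that they ran, and the middle one is made to fail on purpose:

```shell
# Stand-ins for make, make test, and make install. Each appends its
# name to $ran; step_test returns a failure status deliberately.
ran=""
step_build()   { ran="$ran build";   return 0; }
step_test()    { ran="$ran test";    return 1; }
step_install() { ran="$ran install"; return 0; }

# Same shape as: make && make test && make install
if step_build && step_test && step_install; then
    echo "all steps succeeded"
else
    echo "stopped early; steps that ran:$ran"
fi
```

Because step_test fails, step_install is never called, just as make install would not be attempted after a failed make test.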

Finally, change to the Apache source distribution directory and run make install to create the Apache directory tree and install Apache's header files (*.h), default configuration files (*.conf), the httpd executable, and a few other programs:

panic# cd ../apache_1.3.xx
 panic# make install

Note that, as with a plain Apache installation, any configuration files left from a previous installation will not be overwritten by this process. Although backing up is never unwise, it's not actually necessary to back up the previously working configuration files before the installation.

At the end of the make install process, the installation program will list the path to the apachectl utility, which you can use to start and stop the server, and the path to the installed configuration files. It is important to write down these pathnames, as they will be needed frequently when maintaining and configuring Apache. On our machines, these two important paths are:

/usr/local/apache/bin/apachectl
 /usr/local/apache/conf/httpd.conf

The mod_perl Apache server is now built and installed. All that needs to be done before it can be run is to edit the configuration file httpd.conf and write a test script.

2.3 Configuring and Starting the mod_perl Server

Once you have mod_perl installed, you need to configure the server and test it.

The first thing to do is ensure that Apache was built correctly and that it can serve plain HTML files. This helps to minimize the number of possible problem areas: once you have confirmed that Apache can serve plain HTML files, you know that any problems with mod_perl are related to mod_perl itself.

Apache should be configured just as you would configure it without mod_perl. Use the defaults as suggested, customizing only when necessary. Values that will probably need to be customized are ServerName, Port, User, Group, ServerAdmin, DocumentRoot, and a few others. There are helpful hints preceding each directive in the configuration files themselves, with further information in Apache's documentation. Follow the advice in the files and documentation if in doubt.

When the configuration file has been edited, start the server. One of the ways to start and stop the server is to use the apachectl utility. To start the server with apachectl, type:

panic# /usr/local/apache/bin/apachectl start

To stop the server, type:

panic# /usr/local/apache/bin/apachectl stop

Note that if the server will listen on port 80 or another privileged port,^[1] the user executing apachectl must be root.

^[1] Privileged ports are 0-1023. Only the programs running as root are allowed to bind to these.
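As a quick illustration of the rule, the following sketch decides whether a given port falls in the privileged range. The function name needs_root is our own:

```shell
# needs_root PORT: succeed if PORT is in the privileged range 0-1023.
# (needs_root is a hypothetical helper.)
needs_root() {
    [ "$1" -ge 0 ] && [ "$1" -le 1023 ]
}

for port in 80 8080; do
    if needs_root "$port"; then
        echo "port $port: start the server as root"
    else
        echo "port $port: an unprivileged user can bind to it"
    fi
done
```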

After the server has started, check in the error_log file (/usr/local/apache/logs/error_log, by default) to see if the server has indeed started. Do not rely on the apachectl status reports. The error_log should contain something like the following:

[Thu Jun 22 17:14:07 2000] [notice] Apache/1.3.12 (Unix) 
 mod_perl/1.24 configured -- resuming normal operations
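Looking for this notice can be automated. The sketch below assumes the default log location mentioned above and searches for the phrase "resuming normal operations"; server_started is a hypothetical helper name:

```shell
# server_started LOGFILE: succeed if LOGFILE contains the startup
# notice shown above. (server_started is a hypothetical helper.)
server_started() {
    grep -q 'resuming normal operations' "$1" 2>/dev/null
}

log=/usr/local/apache/logs/error_log   # default location; adjust to your setup
if server_started "$log"; then
    echo "server is up"
else
    echo "no startup notice found in $log"
fi
```

Note that an old log file may still contain a notice from a previous start, so check the timestamp of the matching line as well.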

Now point your browser to http://localhost/ or http://example.com/, as configured with the ServerName directive. If the Port directive has been set with a value other than 80, add this port number to the end of the server name. For example, if the port is 8080, test the server with http://localhost:8080/ or http://example.com:8080/. The "It Worked!" page, which is an index.html file that is installed automatically when running make install in the Apache source tree, should appear in the browser. If this page does not appear, something went wrong and the contents of the logs/error_log file should be checked. The path to the error log file is specified by the ErrorLog directive in httpd.conf. (It is usually specified relative to the ServerRoot, so a value of logs/error_log usually means /usr/local/apache/logs/error_log if Apache is installed into /usr/local/apache.)
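The way a relative ErrorLog value is combined with ServerRoot can be sketched as follows. This is a simplification that handles only plain file paths (resolve_log is a hypothetical helper, and other forms that ErrorLog may accept are ignored):

```shell
# resolve_log SERVERROOT ERRORLOG: print the full path of the error log.
# An absolute ErrorLog value is used as is; a relative one is taken
# relative to ServerRoot. (resolve_log is a hypothetical helper.)
resolve_log() {
    case "$2" in
        /*) printf '%s\n' "$2" ;;
        *)  printf '%s/%s\n' "${1%/}" "$2" ;;
    esac
}

resolve_log /usr/local/apache logs/error_log
```

With the values from the text, this prints /usr/local/apache/logs/error_log.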

If everything works as expected, shut down the server, open httpd.conf with a text editor, and scroll to the end of the file. The mod_perl configuration directives are conventionally added to the end of httpd.conf. It is possible to place mod_perl's configuration directives anywhere in httpd.conf, but adding them at the end seems to work best in practice.

Assuming that all the scripts that should be executed by the mod_perl-enabled server are located in the /home/stas/modperl directory, add the following configuration directives:

Alias /perl/ /home/stas/modperl/
 
 PerlModule Apache::Registry
 <Location /perl/>
     SetHandler perl-script
     PerlHandler Apache::Registry
     Options +ExecCGI
     PerlSendHeader On
     Allow from all
 </Location>

Save the modified file.

This configuration causes every URI starting with /perl to be handled by the Apache mod_perl module with the handler from the Perl module Apache::Registry.
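To make the effect of the Alias directive concrete, here is a rough sketch of the URI-to-file mapping it sets up. It models only the simple prefix substitution for the two paths used above and is not how Apache itself is implemented; alias_to_path is a hypothetical helper:

```shell
# alias_to_path URI: print the file that a URI under /perl/ maps to,
# or fail if the URI is outside the aliased prefix.
# (alias_to_path is a hypothetical helper.)
alias_to_path() {
    prefix=/perl/
    target=/home/stas/modperl/
    case "$1" in
        "$prefix"*) printf '%s%s\n' "$target" "${1#"$prefix"}" ;;
        *)          return 1 ;;
    esac
}

alias_to_path /perl/test.pl
```

For the URI /perl/test.pl, the sketch prints /home/stas/modperl/test.pl, which is the file that Apache::Registry would then compile and run.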

2.5 Preparing the Scripts Directory

Now you have to select a directory where all the mod_perl scripts and modules will be placed. We usually create a directory called modperl under our home directory for this purpose (e.g., /home/stas/modperl), but it is also common to create a directory called perl under your Apache server root, such as /usr/local/apache/perl.

First create this directory if it doesn't yet exist:

panic% mkdir /home/stas/modperl

Next, set the file permissions. Remember that when scripts are executed from a shell, they are being executed with the permissions of the user's account. Usually, you want to have read, write, and execute access for yourself, but only read and execute permissions for the server. When the scripts are run by Apache, however, the server needs to be able to read and execute them. Apache runs under an account specified by the User directive, typically nobody. You can modify the User directive to run the server under your username, for example:

User stas

Since the permissions on all files and directories should usually be rwx------,^[3] set the directory permissions to:

^[3] See the chmod manpage for more information regarding octal modes.

panic% chmod 0700 /home/stas/modperl

Now no one but you and the server can access the files in this directory. You should set the same permissions for all the files you place under this directory. ^[4]

^[4] You don't need to set the x bit for files that aren't going to be executed; mode 0600 is sufficient for those files.

If the server is running under the nobody account, you have to set the permissions to rwxr-xr-x or 0755 for your files and directories. This is insecure, because other users on the same machine can read your files.

panic# chmod 0755 /home/stas/modperl

If you aren't running the server with your username, you have to set these permissions for all the files created under this directory so Apache can read and execute them.

In the following examples, we assume that you run the server under your username, and hence we set the scripts' permissions to 0700.
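If the octal modes are unfamiliar, you can see what each one means by applying it to a scratch directory and looking at the permission string that ls reports. This sketch uses a temporary directory (created with mktemp -d, which we assume is available) rather than the real scripts directory; mode_of is a hypothetical helper:

```shell
# mode_of PATH: print the ten-character type-and-permission string
# that ls -ld shows for PATH. (mode_of is a hypothetical helper.)
mode_of() {
    ls -ld "$1" | cut -c1-10
}

dir=$(mktemp -d)    # scratch directory standing in for the scripts directory
chmod 0700 "$dir"; echo "0700 -> $(mode_of "$dir")"
chmod 0755 "$dir"; echo "0755 -> $(mode_of "$dir")"
rmdir "$dir"
```

The first line of output shows drwx------ (owner only) and the second drwxr-xr-x (everyone can read and enter the directory).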

2.6 A Sample Apache::Registry Script

One of mod_perl's benefits is that it can run existing CGI scripts written in Perl that were previously used under mod_cgi (the standard Apache CGI handler). Indeed, mod_perl can be used for running CGI scripts without taking advantage of any of mod_perl's special features, while getting the benefit of the potentially huge performance boost. Example 2-1 gives an example of a very simple CGI-style mod_perl script.

Example 2-1. mod_perl_rules1.pl

print "Content-type: text/plain\n\n";
 print "mod_perl rules!\n";

Save this script in the /home/stas/modperl/mod_perl_rules1.pl file. Notice that the #! line (colloquially known as the shebang line) is not needed with mod_perl, although having one causes no problems, as can be seen in Example 2-2.

Example 2-2. mod_perl_rules1.pl with shebang line

#!/usr/bin/perl
 print "Content-type: text/plain\n\n";
 print "mod_perl rules!\n";

Now make the script executable and readable by the server, as explained in the previous section:

panic% chmod 0700 /home/stas/modperl/mod_perl_rules1.pl

The mod_perl_rules1.pl script can be tested from the command line, since it is essentially a regular Perl script:

panic% perl /home/stas/modperl/mod_perl_rules1.pl

This should produce the following output:

Content-type: text/plain
 
 mod_perl rules!
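The empty line in this output is what separates the HTTP header from the body. As an illustration, the following sketch strips the header from CGI-style output and keeps only the body; cgi_body is a hypothetical helper, and the printf command stands in for running the script:

```shell
# cgi_body: read CGI output on stdin and print only the body, that is,
# everything after the first empty line. (cgi_body is a hypothetical helper.)
cgi_body() {
    awk 'in_body { print; next } $0 == "" { in_body = 1 }'
}

# Stand-in for: perl /home/stas/modperl/mod_perl_rules1.pl | cgi_body
printf 'Content-type: text/plain\n\nmod_perl rules!\n' | cgi_body
```

This prints just the line "mod_perl rules!", which is what the browser will display.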

Make sure the server is running and issue these requests using a browser:

http://localhost/perl/mod_perl_rules1.pl

If the port being used is not 80 (e.g., 8080), the port number should be included in the URL:

http://localhost:8080/perl/mod_perl_rules1.pl

Also, the localhost approach will work only if the browser is running on the same machine as the server. If not, use the real server name for this test. For example:

http://example.com/perl/mod_perl_rules1.pl
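The rules for composing the test URL can be summarized in a few lines. In this sketch, test_url is a hypothetical helper that omits the port when it is 80 and appends it otherwise:

```shell
# test_url HOST PORT PATH: print the URL to request, leaving out the
# port when it is the HTTP default (80). (test_url is a hypothetical helper.)
test_url() {
    if [ "$2" = 80 ]; then
        printf 'http://%s%s\n' "$1" "$3"
    else
        printf 'http://%s:%s%s\n' "$1" "$2" "$3"
    fi
}

test_url localhost 80 /perl/mod_perl_rules1.pl
test_url localhost 8080 /perl/mod_perl_rules1.pl
```

The two calls print the localhost URLs shown above, without and with the port number.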

The page rendered should be similar to the one in Figure 2-1.

Figure 2-1. Testing the newly configured server

If you see it, congratulations! You have a working mod_perl server.

If something went wrong, go through the installation process again, making sure that none of the steps are missed and that each is completed successfully. You might also look at the error_log file for error messages. If this does not solve the problem, Chapter 3 will attempt to salvage the situation.

Jumping a little bit ahead, Example 2-3 shows the same CGI script written with the mod_perl API.

Example 2-3. mod_perl_rules2.pl

my $r = Apache->request;
 $r->send_http_header('text/plain');
 $r->print("mod_perl rules!\n");

The mod_perl API needs a request object, $r, to communicate with Apache. The script retrieves this object and uses it to send the HTTP header and print the irrefutable fact about mod_perl's coolness.

This script generates the same output as the previous one.

As you can see, it's not much harder to write your code using the mod_perl API. You need to learn the API, but the concepts are the same. As we will show in the following chapters, usually you will want to use the mod_perl API for better performance or when you need functionality that CGI doesn't provide.

2.6.1 Porting Existing CGI Scripts to mod_perl

Now it's time to move any existing CGI scripts from the /somewhere/cgi-bin directory to /home/stas/modperl. Once moved, they should run much faster when requested from the newly configured base URL (/perl/). For example, a CGI script called test.pl that was previously accessed as /cgi-bin/test.pl can now be accessed as /perl/test.pl under mod_perl and the Apache::Registry module.

Some of the scripts might not work immediately and may require some minor tweaking or even a partial rewrite to work properly with mod_perl. We will talk in depth about these issues in Chapter 6. Most scripts that have been written with care and developed with warnings enabled and the strict pragma^[5] will probably work without any modifications at all.

^[5] Warnings and strict abort your script if you have written sloppy code, so that you won't be surprised by unknown, hidden bugs. Using them is generally considered a good thing in Perl and is very important in mod_perl.

A quick solution that avoids most rewriting or editing of existing scripts that do not run properly under Apache::Registry is to run them under Apache::PerlRun. This can be achieved by simply replacing Apache::Registry with Apache::PerlRun in httpd.conf. Replace the earlier /perl/ configuration in httpd.conf with the following directives and restart the server:

Alias /perl/ /home/stas/modperl/
 PerlModule Apache::PerlRun
 <Location /perl/>
     SetHandler perl-script
     PerlHandler Apache::PerlRun
     Options ExecCGI
     PerlSendHeader On
     Allow from all
 </Location>

Almost every script should now run without problems; the few exceptions will almost certainly be due to the few minor limitations that mod_perl or its handlers have, but these are all solvable and covered in Chapter 6.

As we saw in Chapter 1, Apache::PerlRun is usually useful while transitioning scripts to run properly under Apache::Registry. However, we don't recommend using Apache::PerlRun in the long term; although it is significantly faster than mod_cgi, it's still not as fast as Apache::Registry and mod_perl handlers.

2.7 A Simple mod_perl Content Handler

As we mentioned in the beginning of this chapter, mod_perl lets you run both scripts and handlers. The previous example showed a script, which is probably the most familiar approach to web programming, but the more advanced use of mod_perl involves writing handlers. Have no fear; writing handlers is almost as easy as writing scripts and offers a level of access to Apache's internals that is simply not possible with conventional CGI scripts.

To create a mod_perl handler module, all that is necessary is to wrap the code that would have been the body of a script into a handler subroutine, add a statement to return the status to the server when the subroutine has successfully completed, and add a package declaration at the top of the code.

Just as with scripts, the familiar CGI API may be used. Example 2-4 shows an example.

Example 2-4. ModPerl/Rules1.pm

package ModPerl::Rules1;
 use Apache::Constants qw(:common);
 
 sub handler {
     print "Content-type: text/plain\n\n";
     print "mod_perl rules!\n";
     return OK; # We must return a status to mod_perl
 }
 1; # This is a perl module so we must return true to perl

Alternatively, the mod_perl API can be used. This API provides almost complete access to the Apache core. In the simple example used here, either approach is fine, but when lower-level access to Apache is required, the mod_perl API shown in Example 2-5 must be used.

Example 2-5. ModPerl/Rules2.pm

package ModPerl::Rules2;
 use Apache::Constants qw(:common);
 
 sub handler {
     my $r = shift;
     $r->send_http_header('text/plain');
     $r->print("mod_perl rules!\n");
     return OK; # We must return a status to mod_perl
 }
 1; # This is a perl module so we must return true to perl

Create a directory called ModPerl under one of the directories in @INC (e.g., under /usr/lib/perl5/site_perl/5.6.1), and put Rules1.pm and Rules2.pm into it. (Note that you will need root access in order to do this.) The files should include the code from the above examples. To find out what the @INC directories are, execute:

panic% perl -le 'print join "\n", @INC'

On our machine it reports:

/usr/lib/perl5/5.6.1/i386-linux
 /usr/lib/perl5/5.6.1
 /usr/lib/perl5/site_perl/5.6.1/i386-linux
 /usr/lib/perl5/site_perl/5.6.1
 /usr/lib/perl5/site_perl
 .

Therefore, on our machine, we might place the files in the directory /usr/lib/perl5/site_perl/5.6.1/ModPerl. By default, when you work as root, the files are created with permissions allowing everybody to read them, so here we don't have to adjust the file permissions (the server only needs to be able to read those).
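The reason the files must live in a ModPerl subdirectory is that Perl derives the file name from the package name: each :: becomes a directory separator and .pm is appended, and the result is looked up under each @INC directory in turn. The sketch below reproduces that name translation; module_relpath is a hypothetical helper:

```shell
# module_relpath PACKAGE: print the file path, relative to an @INC
# directory, where Perl looks for PACKAGE.
# (module_relpath is a hypothetical helper.)
module_relpath() {
    printf '%s.pm\n' "$(printf '%s\n' "$1" | sed 's|::|/|g')"
}

module_relpath ModPerl::Rules1
```

For ModPerl::Rules1 this prints ModPerl/Rules1.pm, so with the @INC listing above the file can be placed at /usr/lib/perl5/site_perl/5.6.1/ModPerl/Rules1.pm.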

Now add the following snippet to /usr/local/apache/conf/httpd.conf, to configure mod_perl to execute the ModPerl::Rules1::handler subroutine whenever a request to mod_perl_rules1 is made:

PerlModule ModPerl::Rules1
 <Location /mod_perl_rules1>
    SetHandler perl-script
    PerlHandler ModPerl::Rules1
    PerlSendHeader On
 </Location>

Now issue a request to:

http://localhost/mod_perl_rules1

and, just as with the mod_perl_rules.pl scripts, the following should be rendered as a response:

mod_perl rules!

Don't forget to include the port number if not using port 80 (e.g., http://localhost:8080/mod_perl_rules1); from now on, we will assume you know this.

To test the second module, ModPerl::Rules2, add a similar configuration, while replacing all 1s with 2s:

PerlModule ModPerl::Rules2
 <Location /mod_perl_rules2>
    SetHandler perl-script
    PerlHandler ModPerl::Rules2
 </Location>

In Chapter 4 we will explain why the PerlSendHeader directive is not needed for this particular module.

To test, use the URI:

http://localhost/mod_perl_rules2

You should see the same response from the server that we saw when issuing a request for the former mod_perl handler.

Chapter 3. Installing mod_perl

In Chapter 2, we presented a basic mod_perl installation. In this chapter, we will talk about various ways in which mod_perl can be installed (using a variety of installation parameters), as well as prepackaged binary installations, and more.

Chapter 2 showed you the following commands to build and install a basic mod_perl-enabled Apache server on almost any standard flavor of Unix.

First, download http://www.apache.org/dist/httpd/apache_1.3.xx.tar.gz and http://perl.apache.org/dist/mod_perl-1.xx.tar.gz. Then, issue the following commands:

panic% cd /home/stas/src
 panic% tar xzvf apache_1.3.xx.tar.gz
 panic% tar xzvf mod_perl-1.xx.tar.gz
 panic% cd mod_perl-1.xx
 panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
   DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
 panic% make && make test
 panic# make install
 panic# cd ../apache_1.3.xx
 panic# make install

As usual, replace 1.xx and 1.3.xx with the real version numbers of mod_perl and Apache, respectively.
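If you rebuild often, it can help to keep the version numbers in one place and generate the command list from them. The following is a dry-run sketch only: build_steps is a hypothetical helper that prints the commands from this chapter with the versions filled in and executes nothing, and the arguments in the example call are the same placeholders used in the text:

```shell
# build_steps APACHE_VERSION MODPERL_VERSION: print the build commands
# with the version numbers filled in. Nothing is executed.
# (build_steps is a hypothetical helper.)
build_steps() {
    apache="apache_$1"
    modperl="mod_perl-$2"
    cat <<EOF
tar xzvf $apache.tar.gz
tar xzvf $modperl.tar.gz
cd $modperl
perl Makefile.PL APACHE_SRC=../$apache/src DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
make && make test
make install
cd ../$apache
make install
EOF
}

# Placeholders; pass the real version numbers instead.
build_steps 1.3.xx 1.xx
```

Remember that the two make install commands still need to be run as root, as in the listing above.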

You can then add a few configuration lines to httpd.conf (the Apache configuration file), start the server, and enjoy mod_perl. This should work just fine. Why, then, are you now reading a 50-page chapter on installing mod_perl?

You're reading this chapter for the same reason you bought this book. Sure, the instructions above will get you a working version of mod_perl. But the average reader of this book won't want to stop there. If you're using mod_perl, it's because you want to improve the performance of your web server. And when you're concerned with performance, you're always looking for ways to eke a little bit more out of your server. In essence, that's what this book is about: getting the most out of your mod_perl-enabled Apache server. And it all starts at the beginning, with the installation of the software.

In the basic mod_perl installation, the parameter EVERYTHING=1 enables a lot of options for you, whether you actually need them or not. You may want to enable only the required options, to squeeze even more juice out of mod_perl. You may want to build mod_perl as a loadable object instead of compiling it into Apache, so that it can be upgraded without rebuilding Apache itself. You may also want to install other Apache components, such as PHP or mod_ssl, alongside mod_perl.

To accomplish any of these tasks, you will need to understand various techniques for mod_perl configuration and building. You need to know what configuration parameters are available to you and when and how to use them.

As with Perl, in mod_perl simple things are simple. But when you need to accomplish more complicated tasks, you may have to invest some time to gain a deeper understanding of the process. In this chapter, we will take the following route. We'll start with a detailed explanation of the four stages of the mod_perl installation process, then continue on with the different paths each installation might take according to your goal, followed by a few copy-and-paste real-world installation scenarios. Toward the end of the chapter we will show you various approaches that might make the installation easier, by automating most of the steps. Finally, we'll cover some of the general issues that new users might stumble on while installing mod_perl.

3.1 Configuring the Source

Before building and installing mod_perl you will have to configure it, as you would configure any other Perl module:

panic% perl Makefile.PL [parameters]

Perl Installation Requirements

Make sure you have Perl installed! Use the latest stable version, if possible. To determine your version of Perl, run the following command on the command line:

panic% perl -v

You will need at least Perl Version 5.004. If you don't have it, install it. Follow the instructions in the distribution's INSTALL file. The only thing to watch for is that during the configuration stage (while running ./Configure) you make sure you can dynamically load Perl module extensions. That is, answer YES to the following question:

Do you wish to use dynamic loading? [y]

In this section, we will explain each of the parameters accepted by the Makefile.PL file for mod_perl. First, however, let's talk about how the mod_perl configuration dovetails with Apache's configuration. The source configuration mechanism in Apache 1.3 provides four major features (which of course are available to mod_perl):

Apache modules can use per-module configuration scripts to link themselves into the Apache configuration process. This feature lets you automatically adjust the configuration and build parameters from the Apache module sources. It is triggered by ConfigStart/ConfigEnd sections inside modulename.module files (e.g., see the file libperl.module in the mod_perl distribution).
The APache AutoConf-style Interface (APACI) is the top-level configure script from Apache 1.3; it provides a GNU Autoconf-style interface to the Apache configuration process. APACI is useful for configuring the source tree without manually editing any src/Configuration files. Any parameterization can be done via command-line options to the configure script. Internally, this is just a nifty wrapper over the old src/Configure script.
Since Apache 1.3, APACI is the best way to install mod_perl as cleanly as possible. However, the complete Apache 1.3 source configuration mechanism is available only under Unix at this writing; it doesn't work on Win32.
Dynamic shared object (DSO) support is one of the most interesting features in Apache 1.3. It allows Apache modules to be built as so-called DSOs (usually named modulename.so), which can be loaded via the LoadModule directive in Apache's httpd.conf file. The benefit is that the modules become part of the httpd executable only on demand; they aren't loaded into the address space of the httpd executable until the user asks for them to be. The benefits of DSO support are most evident in relation to memory consumption and added flexibility (in that you won't have to recompile your httpd each time you want to add, remove, or upgrade a module).
The DSO mechanism is provided by Apache's mod_so module, which needs to be compiled into the httpd binary with:
```
panic% ./configure --enable-module=so
```
The usage of any --enable-shared option automatically implies an --enable-module=so option, because the bootstrapping module mod_so is always needed for DSO support. So if, for example, you want the module mod_dir to be built as a DSO, you can write:
```
panic% ./configure --enable-shared=dir
```
and the DSO support will be added automatically.
The APache eXtension Support tool (APXS) is a tool from Apache 1.3 that can be used to build an Apache module as a DSO even outside the Apache source tree. APXS is to Apache what MakeMaker and XS are to Perl.^[1] It knows the platform-dependent build parameters for making DSO files and provides an easy way to run the build commands with them.

^[1] MakeMaker allows easy, automatic configuration, building, testing, and installation of Perl modules, while XS allows you to call functions implemented in C/C++ from Perl code.

Pros and Cons of Building mod_perl as a DSO

As of Apache 1.3, the configuration system supports two optional features for taking advantage of the modular DSO approach: compilation of the Apache core program into a DSO library for shared usage, and compilation of the Apache modules into DSO files for explicit loading at runtime.

Should you build mod_perl as a DSO? Let's study the pros and cons of this installation method, so you can decide for yourself.

Pros:

The server package is more flexible because the actual server executable can be assembled at runtime via LoadModule configuration commands in httpd.conf instead of via AddModule commands in the Configuration file at build time. This allows you to run different server instances (e.g., standard and SSL servers, or servers with and without mod_perl) with only one Apache installation; the only thing you need is different configuration files (or, by judicious use of IfDefine, different startup scripts).

The server package can easily be extended with third-party modules even after installation. This is especially helpful for vendor package maintainers who can create an Apache core package and additional packages containing extensions such as PHP, mod_perl, mod_fastcgi, etc.

DSO support allows easier Apache module prototyping, because with the DSO/APXS pair you can work outside the Apache source tree and need only an apxs -i command followed by an apachectl restart to bring a new version of your currently developed module into the running Apache server.

Cons:

The DSO mechanism cannot be used on every platform, because not all operating systems support shared libraries.

The server starts up approximately 20% slower because of the overhead of the symbol-resolving the Unix loader now has to do.

The server runs approximately 5% slower on some platforms, because position-independent code (PIC) sometimes needs complicated assembler tricks for relative addressing, which are not necessarily as fast as those for absolute addressing.

Because DSO modules cannot be linked against other DSO-based libraries (ld -lfoo) on all platforms (for instance, a.out-based platforms usually don't provide this functionality, while ELF-based platforms do), you cannot use the DSO mechanism for all types of modules. In other words, modules compiled as DSO files are restricted to use symbols only from the Apache core, from the C library (libc) and from any other dynamic or static libraries used by the Apache core, or from static library archives (libfoo.a) containing position-independent code. The only way you can use other code is to either make sure the Apache core itself already contains a reference to it, load the code yourself via dlopen( ), or enable the SHARED_CHAIN rule while building Apache (if your platform supports linking DSO files against DSO libraries). This, however, won't be of much significance to you if you're writing modules only in Perl.

Under some platforms (e.g., many SVR4 systems), there is no way to force the linker to export all global symbols for use in DSOs when linking the Apache httpd executable program. But without the visibility of the Apache core symbols, no standard Apache module could be used as a DSO. The only workaround here is to use the SHARED_CORE feature, because in this way the global symbols are forced to be exported. As a consequence, the Apache src/Configure script automatically enforces SHARED_CORE on these platforms when DSO features are used in the Configuration file or on the configure command line.

Together, these four features provide a way to integrate mod_perl into Apache in a very clean and smooth way. No patching of the Apache source tree is usually required, and for APXS support, not even the Apache source tree is needed.

To benefit from the above features, a hybrid build environment was created for the Apache side of mod_perl. See Section 3.5, later in this chapter, for details.

Once the overview of the four building steps is complete, we will return to each of the above configuration mechanisms when describing different installation passes.

3.1.1 Controlling the Build Process

The configuration stage of the build is performed by the command perl Makefile.PL, which accepts various parameters. This section covers all of the configuration parameters, grouped by their functionality.

Of course, you should keep in mind that these options are cumulative. We display only one or two options being used at once, but you should use the ones you want to enable all at once, in one call to perl Makefile.PL.

APACHE_SRC, DO_HTTPD, NO_HTTPD, PREP_HTTPD

These four parameters are tightly interconnected, as they control the way in which the Apache source is handled.

Typically, when you want mod_perl to be compiled statically with Apache without adding any extra components, you specify the location of the Apache source tree using the APACHE_SRC parameter and use the DO_HTTPD=1 parameter to tell the installation script to build the httpd executable:

panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src DO_HTTPD=1

If no APACHE_SRC is specified, Makefile.PL makes an intelligent guess by looking at the directories at the same level as the mod_perl sources and suggesting a directory with the highest version of Apache found there.

By default, the configuration process will ask you to confirm whether the location of the source tree is correct before continuing. If you use DO_HTTPD=1 or NO_HTTPD=1, the first Apache source tree found or the one you specified will be used for the rest of the build process.

If you don't use DO_HTTPD=1, you will be prompted by the following question:

Shall I build httpd in ../apache_1.3.xx/src for you?

Note that if you set DO_HTTPD=1 but do not use APACHE_SRC=../apache_1.3.xx/src, the first Apache source tree found will be used to configure and build against. Therefore, you should always use an explicit APACHE_SRC parameter, to avoid confusion.

If you don't want to build the httpd in the Apache source tree because you might need to add extra third-party modules, you should use NO_HTTPD=1 instead of DO_HTTPD=1. This option will install all the files that are needed to build mod_perl in the Apache source tree, but it will not build httpd itself.

PREP_HTTPD=1 is similar to NO_HTTPD=1, but if you set this parameter you will be asked to confirm the location of the Apache source directory even if you have specified the APACHE_SRC parameter.

If you choose not to build the binary, you will have to do that manually. Building an httpd binary is covered in an upcoming section. In any case, you will need to run make install in the mod_perl source tree so the Perl side of mod_perl will be installed. Note that mod_perl's make test won't work until you have built the server.

APACHE_HEADER_INSTALL

When Apache and mod_perl are installed, you may need to build other Perl modules that use Apache C functions, such as HTML::Embperl or Apache::Peek. These modules usually will fail to build if Apache header files aren't installed in the Perl tree. By default, the Apache source header files are installed into the $Config{sitearchexp}/auto/Apache/include directory.^[2] If you don't want or need these headers to be installed, you can change this behavior by using the APACHE_HEADER_INSTALL=0 parameter.

^[2] %Config is defined in the Config.pm file in your Perl installation.

USE_APACI

The USE_APACI parameter tells mod_perl to configure Apache using the flexible APACI. The alternative is the older system, which required a file named src/Configuration to be edited manually. To enable APACI, use:

panic% perl Makefile.PL USE_APACI=1

APACI_ARGS

When you use the USE_APACI=1 parameter, you can tell Makefile.PL to pass any arguments you want to the Apache ./configure utility. For example:

panic% perl Makefile.PL USE_APACI=1 \
     APACI_ARGS='--sbindir=/home/httpd/httpd_perl/sbin, \
         --sysconfdir=/home/httpd/httpd_perl/etc'

Note that the APACI_ARGS argument must be passed as a single long line if you work with a C-style shell (such as csh or tcsh), as those shells seem to corrupt multi-lined values enclosed inside single quotes.

Of course, if you want the default Apache directory layout but a different root directory (/home/httpd/httpd_perl/, in our case), the following is the simplest way to do so:

panic% perl Makefile.PL USE_APACI=1 \
     APACI_ARGS='--prefix=/home/httpd/httpd_perl'

ADD_MODULE

This parameter enables building of built-in Apache modules. For example, to enable the mod_rewrite and mod_proxy modules, you can do the following:

panic% perl Makefile.PL ADD_MODULE=proxy,rewrite

If you are already using APACI_ARGS, you can add the usual Apache ./configure directives as follows:

panic% perl Makefile.PL USE_APACI=1 \
     APACI_ARGS='--enable-module=proxy --enable-module=rewrite'
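The correspondence between an ADD_MODULE list and the equivalent ./configure flags can be sketched in shell. The helper name add_module_to_apaci_args is hypothetical and not part of mod_perl; it only rewrites a comma-separated module list into the --enable-module form shown above.

```shell
# Hypothetical helper (not part of mod_perl): turn a comma-separated
# ADD_MODULE list into the equivalent --enable-module flags.
add_module_to_apaci_args() {
    args=""
    old_ifs=$IFS
    IFS=,
    # With IFS set to a comma, the unquoted $1 splits into module names.
    for module in $1; do
        args="${args:+$args }--enable-module=$module"
    done
    IFS=$old_ifs
    printf '%s\n' "$args"
}

# Print the flags for the two modules used in the example above.
add_module_to_apaci_args proxy,rewrite
```

The printed string could then be placed inside the single quotes of APACI_ARGS.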

APACHE_PREFIX

As an alternative to:

APACI_ARGS='--prefix=/home/httpd/httpd_perl'

you can use the APACHE_PREFIX parameter. When USE_APACI is enabled, this attribute specifies the same prefix option.

Additionally, the APACHE_PREFIX option automatically executes make install in the Apache source directory, which makes the following commands:

panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
     DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
     APACI_ARGS='--prefix=/home/httpd/httpd_perl'
 panic% make && make test
 panic# make install
 panic# cd ../apache_1.3.xx
 panic# make install

equivalent to these commands:

panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
     DO_HTTPD=1 USE_APACI=1 EVERYTHING=1 \
     APACHE_PREFIX=/home/httpd/httpd_perl
 panic% make && make test
 panic# make install

PERL_STATIC_EXTS

Normally, if a C code extension is statically linked with Perl, it is listed in Config.pm's $Config{static_exts}, in which case mod_perl will also statically link this extension with httpd. However, if an extension is statically linked with Perl after it is installed, it will not be listed in Config.pm. You can either edit Config.pm and add these extensions, or configure mod_perl like this:

panic% perl Makefile.PL "PERL_STATIC_EXTS=DBI DBD::Oracle"

DYNAMIC

This option tells mod_perl to build the Apache::* API extensions as shared libraries. The default is to link these modules statically with the httpd executable. This can save some memory if you use these API features only occasionally. To enable this option, use:

panic% perl Makefile.PL DYNAMIC=1

USE_APXS

If this option is enabled, mod_perl will be built using the APXS tool. This tool is used to build C API modules in a way that is independent of the Apache source tree. mod_perl will look for the apxs executable in the location specified by WITH_APXS; otherwise, it will check the bin and sbin directories relative to APACHE_PREFIX. To enable this option, use:

panic% perl Makefile.PL USE_APXS=1

WITH_APXS

This attribute tells mod_perl the location of the apxs executable. This is necessary if the binary cannot be found in the command path or in the location specified by APACHE_PREFIX. For example:

panic% perl Makefile.PL USE_APXS=1 WITH_APXS=/home/httpd/bin/apxs

USE_DSO

This option tells mod_perl to build itself as a DSO. Although this reduces the apparent size of the httpd executable on disk, it doesn't actually reduce the memory consumed by each httpd process. This is recommended only if you are going to be using the mod_perl API only occasionally, or if you wish to experiment with its features before you start using it in a production environment. To enable this option, use:

panic% perl Makefile.PL USE_DSO=1

SSL_BASE

When building against a mod_ssl-enabled server, this option will tell Apache where to look for the SSL include and lib subdirectories. For example:

panic% perl Makefile.PL SSL_BASE=/usr/share/ssl

PERL_DESTRUCT_LEVEL={1,2}

When the Perl interpreter shuts down, this level enables additional checks during server shutdown to make sure the interpreter has done proper bookkeeping. The default is 0. A value of 1 enables full destruction, and 2 enables full destruction with checks. This value can also be changed at runtime by setting the environment variable PERL_DESTRUCT_LEVEL. We will revisit this parameter in Chapter 5.

PERL_TRACE

To enable mod_perl debug tracing, configure mod_perl with the PERL_TRACE option:

panic% perl Makefile.PL PERL_TRACE=1

To see the diagnostics, you will also need to set the MOD_PERL_TRACE environment variable at runtime.

We will use mod_perl configured with this parameter enabled to show a few debugging techniques in Chapter 21.

PERL_DEBUG

This option builds mod_perl and the Apache server with C source code debugging enabled (the -g switch). It also enables PERL_TRACE, sets PERL_DESTRUCT_LEVEL to 2, and links against the debuggable libperld Perl interpreter if one has been installed. You will be able to debug the Apache executable and each of its modules with a source-level debugger, such as the GNU debugger gdb. To enable this option, use:

panic% perl Makefile.PL PERL_DEBUG=1

We will discuss this option in Chapter 21, as it is extremely useful to track down bugs or report problems.

3.1.2 Activating Callback Hooks

A callback hook (also known simply as a callback) is a reference to a subroutine. In Perl, we create subroutine references with the following syntax:

$callback = \&subroutine;

In this example, $callback contains a reference to the subroutine called subroutine. Another way to create a callback is to use an anonymous subroutine:

$callback = sub { 'some code' };

Here, $callback contains a reference to the anonymous subroutine. Callbacks are used when we want some action (subroutine call) to occur when some event takes place. Since we don't know exactly when the event will take place, we give the event handler a reference to the subroutine we want to be executed. The handler will call our subroutine at the right time, effectively calling back that subroutine.

By default, most of the callback hooks except for PerlHandler, PerlChildInitHandler, PerlChildExitHandler, PerlConnectionApi, and PerlServerApi are turned off. You may enable them via options to Makefile.PL.

Here is the list of available hooks and the parameters that enable them. The Apache request processing phases were explained in Chapter 1.

Directive/Hook              Configuration Option
 --------------------------------------------------------
 PerlPostReadRequestHandler  PERL_POST_READ_REQUEST
 PerlTransHandler            PERL_TRANS
 PerlInitHandler             PERL_INIT
 PerlHeaderParserHandler     PERL_HEADER_PARSER
 PerlAuthenHandler           PERL_AUTHEN
 PerlAuthzHandler            PERL_AUTHZ
 PerlAccessHandler           PERL_ACCESS
 PerlTypeHandler             PERL_TYPE
 PerlFixupHandler            PERL_FIXUP
 PerlHandler                 PERL_HANDLER
 PerlLogHandler              PERL_LOG
 PerlCleanupHandler          PERL_CLEANUP
 PerlChildInitHandler        PERL_CHILD_INIT
 PerlChildExitHandler        PERL_CHILD_EXIT
 PerlDispatchHandler         PERL_DISPATCH

As with any parameters that are either defined or not, use OPTION_FOO=1 to enable them (e.g., PERL_AUTHEN=1).

To enable all callback hooks, use:

ALL_HOOKS=1

There are a few more hooks that won't be enabled by default, because they are experimental.

If you are using:

panic% perl Makefile.PL EVERYTHING=1 ...

it already includes the ALL_HOOKS=1 option.

3.1.3 Activating Standard API Features

The following options enable various standard features of the mod_perl API. While not absolutely needed, they're very handy and there's little penalty in including them. Unless specified otherwise, these options are all disabled by default. The EVERYTHING=1 or DYNAMIC=1 options will enable them en masse. If in doubt, include these.

PERL_FILE_API=1: Enables the Apache::File class, which helps with the handling of files under mod_perl.
PERL_TABLE_API=1: Enables the Apache::Table class, which provides tied access to the Apache Table structure (used for HTTP headers, among others).
PERL_LOG_API=1: Enables the Apache::Log class. This class allows you to access Apache's more advanced logging features.
PERL_URI_API=1: Enables the Apache::URI class, which deals with the parsing of URIs in a similar way to the Perl URI::URL module, but much faster.
PERL_UTIL_API=1: Enables the Apache::Util class, allowing you to use various functions such as HTML escaping or date parsing, but implemented in C.
PERL_CONNECTION_API=1: Enables the Apache::Connection class. This class is enabled by default. Set the option to 0 to disable it.
PERL_SERVER_API=1: Enables the Apache::Server class. This class is enabled by default. Set the option to 0 to disable it.

Please refer to Lincoln Stein and Doug MacEachern's Writing Apache Modules with Perl and C (O'Reilly) for more information about the Apache API.

3.1.4 Enabling Extra Features

mod_perl comes with a number of other features. Most of them are disabled by default. This is the list of features and options to enable them:

<Perl> sections give you a way to configure Apache using Perl code in the httpd.conf file itself. See Chapter 4 for more information.
```
panic% perl Makefile.PL PERL_SECTIONS=1 ...
```
With the PERL_SSI option, the mod_include module can be extended to include a #perl directive.
```
panic% perl Makefile.PL PERL_SSI=1
```
By enabling PERL_SSI, a new #perl element is added to the standard mod_include functionality. This element allows server-side includes to call Perl subroutines directly. This feature works only when mod_perl is not built as a DSO (i.e., when it's built statically).
If you develop an Apache module in Perl and you want to create custom configuration directives^[3] to be recognized in httpd.conf, you need to use Apache::ModuleConfig and Apache::CmdParms. For these modules to work, you will need to enable this option:

^[3] See Chapters 8 and 9 of Writing Apache Modules with Perl and C (O'Reilly).
```
panic% perl Makefile.PL PERL_DIRECTIVE_HANDLERS=1
```
The stacked handlers feature explained in Chapter 4 requires this parameter to be enabled:
```
panic% perl Makefile.PL PERL_STACKED_HANDLERS=1
```
The method handlers feature discussed in Chapter 4 requires this parameter to be enabled:
```
panic% perl Makefile.PL PERL_METHOD_HANDLERS=1
```
To enable all phase callback handlers, all API modules, and all miscellaneous features, use the "catch-all" option we used when we first compiled mod_perl:
```
panic% perl Makefile.PL EVERYTHING=1
```

3.1.5 Reusing Configuration Parameters

When you have to upgrade the server, it's sometimes hard to remember what parameters you used in the previous mod_perl build. So it's a good idea to save them in a file.

One way to save parameters is to create a file (e.g., ~/.mod_perl_build_options) with the following contents:

APACHE_SRC=../apache_1.3.xx/src DO_HTTPD=1 USE_APACI=1 EVERYTHING=1

Then build the server with the following command:

panic% perl Makefile.PL `cat ~/.mod_perl_build_options`
 panic% make && make test
 panic# make install

But mod_perl has a standard method to perform this trick. If a file named makepl_args.mod_perl is found in the same directory as the mod_perl build location, it will be read in by Makefile.PL. Parameters supplied at the command line will override the parameters given in this file.

The makepl_args.mod_perl file can also be located in your home directory or in the ../ directory relative to the mod_perl distribution directory. The filename can also start with a dot (.makepl_args.mod_perl), so you can keep it nicely hidden along with the rest of the dot files in your home directory. So, Makefile.PL will look for the following files (in this order), using the first one it comes across:

./makepl_args.mod_perl
 ../makepl_args.mod_perl
 ./.makepl_args.mod_perl
 ../.makepl_args.mod_perl
 $ENV{HOME}/.makepl_args.mod_perl

For example:

panic% ls -1 /home/stas/src
 apache_1.3.xx/
 makepl_args.mod_perl
 mod_perl-1.xx/
 
 panic% cat makepl_args.mod_perl
 APACHE_SRC=../apache_1.3.xx/src
 DO_HTTPD=1
 USE_APACI=1
 EVERYTHING=1
 
 panic% cd mod_perl-1.xx
 panic% perl Makefile.PL
 panic% make && make test
 panic# make install

Now the parameters from the makepl_args.mod_perl file will be used automatically, as if they were entered directly.
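To see which of the candidate files would be picked up, the lookup order listed above can be reproduced with a small shell function. This is a sketch that only mirrors the documented order; the function name find_makepl_args is hypothetical, and the code is not how Makefile.PL itself is implemented. It assumes you run it from the mod_perl source directory.

```shell
# Hypothetical helper: print the first makepl_args.mod_perl candidate
# that exists, checking the locations in the documented order.
find_makepl_args() {
    for candidate in \
        ./makepl_args.mod_perl \
        ../makepl_args.mod_perl \
        ./.makepl_args.mod_perl \
        ../.makepl_args.mod_perl \
        "$HOME/.makepl_args.mod_perl"
    do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Report the file that would be used, or say that none was found.
find_makepl_args || echo "no makepl_args.mod_perl file found"
```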

In the sample makepl_args.mod_perl file in the eg/ directory of the mod_perl distribution package, you might find a few options enabling some experimental features for you to play with, too!

If you are faced with a compiled Apache and no trace of the parameters used to build it, you can usually still find them if make clean was not run on the sources. You will find the Apache-specific parameters in apache_1.3.xx/config.status and the mod_perl parameters in mod_perl-1.xx/apaci/mod_perl.config.

3.1.6 Discovering Whether a Feature Was Enabled

mod_perl Version 1.25 introduced Apache::MyConfig, which provides access to the various hooks and features set when mod_perl was built. This circumvents the need to set up a live server just to find out if a certain callback hook is available.

To see whether some feature was built in or not, check the %Apache::MyConfig::Setup hash. For example, suppose we install mod_perl with the following options:

panic% perl Makefile.PL EVERYTHING=1

but the next day we can't remember which callback hooks were enabled. We want to know whether the PERL_LOG callback hook is available. One of the ways to find an answer is to run the following code:

panic% perl -MApache::MyConfig -e 'print $Apache::MyConfig::Setup{PERL_LOG}'

If it prints 1, that means the PERL_LOG callback hook is enabled (which it should be, as EVERYTHING=1 enables them all).

Another approach is to configure Apache::Status (see Chapter 9) and run http://localhost/perl-status?hooks to check for enabled hooks.

If you want to check for the existence of various hooks within your handlers, you can use the script shown in Example 3-1.

Example 3-1. test_hooks.pl

use mod_perl_hooks;
 
 for my $hook (mod_perl::hooks( )) {
     if (mod_perl::hook($hook)) {
         print "$hook is enabled\n";
     }
     else {
         print "$hook is not enabled\n";
     }
 }

You can also try to look at the symbols inside the httpd executable with the help of nm(1) or a similar utility. For example, if you want to see whether you enabled PERL_LOG=1 while building mod_perl, you can search for a symbol with the same name but in lowercase:

panic% nm httpd | grep perl_log
 08071724 T perl_logger

This shows that PERL_LOG=1 was enabled. But this approach will work only if you have an unstripped httpd binary. By default, make install strips the binary before installing it, thus removing the symbol names to save space. Use the --without-execstrip ./configure option to prevent stripping during the make install phase.^[4]

^[4] You might need the unstripped version for debugging reasons too.
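The nm check can be wrapped in a small function that lowercases the option name for you. This is a sketch: the name option_in_symbols is hypothetical, and it assumes the symbol names contain the lowercase option name unchanged, as in the perl_logger example above. It reads nm output on standard input, so it can be tried on the sample line without a real binary.

```shell
# Hypothetical helper: given a Makefile.PL option such as PERL_LOG,
# succeed if its lowercase form appears in nm output read from stdin.
option_in_symbols() {
    symbol=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    grep -q "$symbol"
}

# Try it on the sample nm line shown above.
if printf '08071724 T perl_logger\n' | option_in_symbols PERL_LOG; then
    echo "PERL_LOG looks enabled"
fi

# Against a real, unstripped binary the call would look like this:
#   nm httpd | option_in_symbols PERL_LOG && echo "PERL_LOG looks enabled"
```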

Yet another approach that will work in most cases is to try to use the feature in question. If it wasn't configured, Apache will give an error message.

3.1.7 Using an Alternative Configuration File

By default, mod_perl provides its own copy of the Configuration file to Apache's configure utility. If you want to pass it your own version, do this:

panic% perl Makefile.PL CONFIG=Configuration.custom

where Configuration.custom is the pathname of the file relative to the Apache source tree you build against.

3.1.8 perl Makefile.PL Troubleshooting

During the configuration (perl Makefile.PL ) stage, you may encounter some of these problems. To help you avoid them, let's study them, find out why they happened, and discuss how to fix them.

3.1.8.1 A test compilation with your Makefile configuration failed...

When you see the following error during the perl Makefile.PL stage:

** A test compilation with your Makefile configuration
 ** failed. This is most likely because your C compiler
 ** is not ANSI. Apache requires an ANSI C Compiler, such
 ** as gcc. The above error message from your compiler
 ** will also provide a clue.
  Aborting!

it's possible that you have a problem with a compiler. It may be improperly installed or not installed at all. Sometimes the reason is that your Perl executable was built on a different machine, and the software installed on your machine is not the same. Generally this happens when you install prebuilt packages, such as rpm or deb. You may find that the dependencies weren't properly defined in the Perl binary package and you were allowed to install it even though some essential packages were not installed.

The most frequent pitfall is a missing gdbm library (see the next section).

But why guess, when we can actually see the real error message and understand what the real problem is? To get a real error message, edit the Apache src/Configure script. Around line 2140, you should see a line like this:

if ./helpers/TestCompile sanity; then

Add the -v option, as follows:

if ./helpers/TestCompile -v sanity; then

and try again. Now you should get a useful error message.

3.1.8.2 Missing or misconfigured libgdbm.so

On some Red Hat Linux systems, you might encounter a problem during the perl Makefile.PL stage, when Perl was installed from an rpm package built with the gdbm library, but libgdbm isn't actually installed. If this happens to you, make sure you install it before proceeding with the build process.

You can check how Perl was built by running the perl -V command:

panic% perl -V | grep libs

You should see output similar to this:

libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt
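If you want to test for one particular library rather than read the line by eye, a sketch like the following may help. The function name libs_mentions is hypothetical; it simply looks for a -lNAME flag in a libs= line such as the one above.

```shell
# Hypothetical helper: succeed if a "libs=..." line, as printed by
# perl -V, contains the flag -l<name> for the given library name.
libs_mentions() {
    flags=${2#*libs=}            # drop everything up to and including "libs="
    case " $flags " in
        *" -l$1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the sample line from the text for gdbm.
sample='libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt'
if libs_mentions gdbm "$sample"; then
    echo "Perl was linked against gdbm; make sure libgdbm is installed"
fi
```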

Sometimes the problem is even more obscure: you do have libgdbm installed, but it's not installed properly. Do this:

panic% ls /usr/lib/libgdbm.so*

If you get at least three lines, like we do:

lrwxrwxrwx   /usr/lib/libgdbm.so -> libgdbm.so.2.0.0
 lrwxrwxrwx   /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0
 -rw-r--r--   /usr/lib/libgdbm.so.2.0.0

you are all set. On some installations, the libgdbm.so symbolic link is missing, so you get only:

lrwxrwxrwx   /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0
 -rw-r--r--   /usr/lib/libgdbm.so.2.0.0

To fix this problem, add the missing symbolic link:

panic% cd /usr/lib
 panic% ln -s libgdbm.so.2.0.0 libgdbm.so

Now you should be able to build mod_perl without any problems.

Note that you might need to prepare this symbolic link as well:

lrwxrwxrwx   /usr/lib/libgdbm.so.2 -> libgdbm.so.2.0.0

with the command:

panic% ln -s libgdbm.so.2.0.0 libgdbm.so.2

Of course, if a new version of the libgdbm library was released between the moment we wrote this sentence and the moment you're reading it, you will have to adjust the version numbers. We didn't use the usual xx.xx version replacement here, to make it easier to understand how the symbolic links should be set.
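The check for the three libgdbm names can be scripted as follows. This is a sketch with a hypothetical function name, missing_gdbm_links, and it hardcodes the version numbers used in this section, so adjust them to match your system.

```shell
# Hypothetical helper: print which of the three libgdbm names discussed
# above are missing from a library directory (default: /usr/lib).
missing_gdbm_links() {
    libdir=${1:-/usr/lib}
    for name in libgdbm.so libgdbm.so.2 libgdbm.so.2.0.0; do
        # -e follows symbolic links, so a dangling link counts as missing.
        [ -e "$libdir/$name" ] || printf '%s\n' "$name"
    done
}

# List the names that are absent from /usr/lib on this machine.
missing_gdbm_links /usr/lib
```

An empty result means all three names resolve; any name that is printed needs to be installed or linked as described above.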

About the gdbm, db, and ndbm Libraries

If you need to have the dbm library linked in, you should know that both the gdbm and db libraries offer ndbm emulation, which is the interface that Apache actually uses. So when you build mod_perl, you end up using whichever library was linked first by the Perl compilation. If you build Apache without mod_perl, you end up with whatever appears to be your ndbm library, which will vary between systems, and especially Linux distributions. So you may have to work a bit to get both Apache and Perl to use the same library, and you are likely to have trouble copying the dbm file from one system to another or even using it after an upgrade.

3.1.8.3 Undefined reference to `PL_perl_destruct_level'

When manually building mod_perl using the shared library:

panic% cd mod_perl-1.xx
 panic% perl Makefile.PL PREP_HTTPD=1
 panic% make && make test
 panic# make install
 
 panic% cd ../apache_1.3.xx
 panic% ./configure --with-layout=RedHat --target=perlhttpd \
     --activate-module=src/modules/perl/libperl.a

you might see the following output:

gcc -c  -I./os/unix -I./include   -DLINUX=2 -DTARGET=\"perlhttpd\"
 -DUSE_HSREGEX -DUSE_EXPAT -I./lib/expat-lite `./apaci` buildmark.c
 gcc  -DLINUX=2 -DTARGET=\"perlhttpd\" -DUSE_HSREGEX -DUSE_EXPAT 
 -I./lib/expat-lite `./apaci`    \
       -o perlhttpd buildmark.o modules.o modules/perl/libperl.a 
 modules/standard/libstandard.a main/libmain.a ./os/unix/libos.a ap/libap.a 
 regex/libregex.a lib/expat-lite/libexpat.a  -lm -lcrypt
 modules/perl/libperl.a(mod_perl.o): In function `perl_shutdown':
 mod_perl.o(.text+0xf8): undefined reference to `PL_perl_destruct_level'
 mod_perl.o(.text+0x102): undefined reference to `PL_perl_destruct_level'
 mod_perl.o(.text+0x10c): undefined reference to `PL_perl_destruct_level'
 mod_perl.o(.text+0x13b): undefined reference to `Perl_av_undef'
 [more errors snipped]

This happens when Perl was built statically linked, with no shared libperl.a. Build a dynamically linked Perl (with libperl.a) and the problem will disappear.

3.2 Building mod_perl (make)

After completing the configuration, it's time to build the server by simply calling:

panic% make

The make program first compiles the source files and creates a mod_perl library file. Then, depending on your configuration, this library is either linked with httpd (statically) or not linked at all, allowing you to dynamically load it at runtime.

You should avoid putting the mod_perl source directory inside the Apache source directory, as this confuses the build process. The best choice is to put both source directories under the same parent directory.

3.2.1 What Compiler Should Be Used to Build mod_perl?

All Perl modules that use C extensions must be compiled using the compiler with which your copy of Perl was built.

When you run perl Makefile.PL, a Makefile is created. This Makefile includes the same compilation options that were used to build Perl itself. They are stored in the Config.pm module and can be displayed with the perl -V command. All these options are reapplied when compiling Perl modules.

If you use a different compiler to build Perl extensions, chances are that the options this compiler uses won't be the same, or they might be interpreted in a completely different way. So the code may not compile, may dump core, or may behave in unexpected ways.

Since Perl, Apache, and third-party modules all work together under mod_perl, it's essential to use the same compiler while building each of the components.

If you compile a non-Perl component separately, you should make sure to use both the same compiler and the same options used to build Perl. You can find much of this information by running perl -V.
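For a single setting, perl -V:cc prints the compiler name in the form cc='gcc'; and the value can be pulled out in shell. The following is a sketch; the function name config_value is hypothetical, and it assumes the name='value'; output format.

```shell
# Hypothetical helper: extract the value from a line in the
# name='value'; format that "perl -V:name" prints.
config_value() {
    printf '%s\n' "$1" | sed "s/^[^=]*='\(.*\)';\$/\1/"
}

# Show the compiler Perl was built with, if perl is available.
if command -v perl >/dev/null 2>&1; then
    perl_cc=$(config_value "$(perl -V:cc)")
    echo "Perl was built with: $perl_cc"
fi
```

You can compare the reported name with the compiler you are about to use for Apache and any third-party modules.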

3.2.2 make Troubleshooting

The following errors are the ones that frequently occur during the make process when building mod_perl.

3.2.2.1 Undefined reference to `Perl_newAV'

This and similar error messages may show up during the make process. Generally it happens when you have a broken Perl installation. If it's installed from a broken rpm or another precompiled binary package, build Perl from source or use another properly built binary package. Run perl -V to learn what version of Perl you are using and other important details.

3.2.2.2 Unrecognized format specifier for...

This error is usually reported due to the problems with some versions of the SFIO library. Try to use the latest version to get around this problem or, if you don't really need SFIO, rebuild Perl without this library.

3.3 Testing the Server (make test)

After building the server, it's a good idea to test it thoroughly by calling:

panic% make test

Fortunately, mod_perl comes with a big collection of tests, which attempt to exercise all the features you asked for at the configuration stage. If any of the tests fails, the make test step will fail.

Running make test will start the freshly built httpd on port 8529 (an unprivileged port), running under the UID (user ID) and GID (group ID) of the perl Makefile.PL process. The httpd will be terminated when the tests are finished.

To change the default port (8529) used for the tests, do this:

panic% perl Makefile.PL PORT=xxxx

Each file in the testing suite generally includes more than one test, but when you do the testing, the program will report only how many tests were passed and the total number of tests defined in the test file. To learn which ones failed, run the tests in verbose mode by using the TEST_VERBOSE parameter:

panic% make test TEST_VERBOSE=1

As of mod_perl v1.23, you can use the environment variables APACHE_USER and APACHE_GROUP to override the default User and Group settings in the httpd.conf file used for make test. These two variables should be set before the Makefile is created to take effect during the testing stage. For example, if you want to set them to httpd, you can do the following in the Bourne-style shell:

panic% export APACHE_USER=httpd
 panic% export APACHE_GROUP=httpd
 panic% perl Makefile.PL ...

3.3.1 Manual Testing

Tests are invoked by running the ./TEST script located in the ./t directory. Use the -v option for verbose tests. You might run an individual test like this:

panic% perl t/TEST -v modules/file.t

or all tests in a test subdirectory:

panic% perl t/TEST modules

The TEST script starts the server before the test is executed. If for some reason it fails to start, use make start_httpd to start it manually:

panic% make start_httpd

To shut down Apache when the testing is complete, use make kill_httpd:

panic% make kill_httpd

3.3.2 make test Troubleshooting

The following sections cover problems that you may encounter during the testing stage.

3.3.2.1 make test fails

make test requires Apache to be running already, so if you specified NO_HTTPD=1 during the perl Makefile.PL stage, you'll have to build httpd independently before running make test. Go to the Apache source tree and run make, then return to the mod_perl source tree and continue with the server testing.

If you get an error like this:

still waiting for server to warm up...............not ok

you may want to examine the t/logs/error_log file, where all the make test-stage errors are logged. If you still cannot find the problem or this file is completely empty, you may want to run the test with strace (or truss) in the following way (assuming that you are located in the root directory of the mod_perl source tree):

panic% make start_httpd
 panic% strace -f -s1024 -o strace.out -p `cat t/logs/httpd.pid` &
 panic% make run_tests
 panic% make kill_httpd

where the strace -f option tells strace to trace child processes as they are created, -s1024 allows trace strings of a maximum of 1024 characters to be printed (it's 32 by default), -o gives the name of the file to which the output should be written, -p supplies the PID of the parent process, and & puts the job in the background.

When the tests are complete, you can examine the generated strace.out file and hopefully find the problem. We talk about creating and analyzing trace outputs in Chapter 21.
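As a rough aid for that examination, the following Python sketch lists the system calls in a trace that returned an error. It assumes the usual strace -f -o line layout (a PID, then the call, then = -1 and an errno name). The function name failed_calls and the sample trace lines are made up for illustration; they are not part of mod_perl or strace.

```python
import re

# Assumed line layout of `strace -f -o strace.out` output:
#   <pid>  <syscall>(<args>) = -1 <ERRNO> (<message>)
FAILED_CALL = re.compile(
    r"^(?P<pid>\d+)\s+(?P<call>\w+)\((?P<args>.*)\)\s+= -1 (?P<errno>E[A-Z0-9]+)"
)


def failed_calls(trace_text):
    """Return (pid, syscall, errno) for every call that returned -1."""
    failures = []
    for line in trace_text.splitlines():
        match = FAILED_CALL.match(line)
        if match:
            failures.append(
                (int(match.group("pid")), match.group("call"), match.group("errno"))
            )
    return failures


# Made-up sample lines, only to show the expected input format.
sample = (
    '1234  open("/no/such/file", O_RDONLY) = -1 ENOENT (No such file or directory)\n'
    "1234  close(3) = 0\n"
)
for pid, call, errno_name in failed_calls(sample):
    print(pid, call, errno_name)
```

To scan a real trace, pass in the contents of strace.out, for example failed_calls(open("strace.out").read()). Many failed calls are harmless (a program probing for optional files, for instance), so treat the list as a starting point only.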

3.3.2.2 mod_perl.c is incompatible with this version of Apache

If you had a stale Apache header layout in one of the include paths during the build process, you will see the message "mod_perl.c is incompatible with this version of Apache" when you try to execute httpd. Find the file ap_mmn.h using find, locate, or another utility. Delete this file and rebuild Apache. The Red Hat Linux distribution usually installs it in /usr/local/include.

Before installing mod_perl-enabled Apache from scratch, it's a good idea to remove all the pre-installed Apache modules, and thus save the trouble of looking for files that mess up the build process. For example, to remove the precompiled Apache installed as a Red Hat Package Manager (RPM) package, as root you should do:

panic# rpm -e apache

There may be other RPM packages that depend on the Apache RPM package. You will be notified about any other dependent packages, and you can decide whether to delete them, too. You can always supply the --nodeps option to tell the RPM manager to ignore the dependencies.

apt users would do this instead:

panic# apt-get remove apache

3.3.2.3 make test......skipping test on this platform

make test may report some tests as skipped. They are skipped because you are missing the modules that are needed for these tests to pass. You might want to peek at the contents of each test; you will find them all in the ./t directory. It's possible that you don't need any of the missing modules to get your work done, in which case you shouldn't worry that the tests are skipped.

If you want to make sure that all tests pass, you will need to figure out what modules are missing from your installation. For example, if you see:

modules/cookie......skipping test on this platform

you may want to install the Apache::Cookie module. If you see:

modules/request.....skipping test on this platform

Apache::Request is missing.^[5] If you see:

modules/psections...skipping test on this platform

Devel::Symdump and Data::Dumper are needed.

^[5] Apache::Cookie and Apache::Request are both part of the libapreq distribution.

Chances are that all of these will be installed if you use CPAN.pm to install Bundle::Apache. We talk about CPAN installations later in this chapter.
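The mapping between skipped tests and missing modules described above can be written down as a small lookup. This Python sketch covers only the three tests named in this section. The function name missing_modules is hypothetical, and the script is not part of the mod_perl distribution.

```python
import re

# Skipped test -> modules it needs, as listed in this section.
NEEDED_MODULES = {
    "modules/cookie": ["Apache::Cookie"],
    "modules/request": ["Apache::Request"],
    "modules/psections": ["Devel::Symdump", "Data::Dumper"],
}

# Matches lines such as "modules/cookie......skipping test on this platform".
SKIPPED_LINE = re.compile(r"^(\S+?)\.+skipping test on this platform")


def missing_modules(test_output):
    """Return the modules needed by the skipped tests found in make test output."""
    modules = []
    for line in test_output.splitlines():
        match = SKIPPED_LINE.match(line.strip())
        if match:
            # Tests that are not listed in NEEDED_MODULES are silently ignored.
            for module in NEEDED_MODULES.get(match.group(1), []):
                if module not in modules:
                    modules.append(module)
    return modules


output = (
    "modules/cookie......skipping test on this platform\n"
    "modules/psections...skipping test on this platform\n"
)
print(missing_modules(output))
```

Feeding it the saved output of make test gives a list of module names that can then be installed by whatever means you prefer.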

3.3.2.4 make test fails due to misconfigured localhost entry

The make test suite uses localhost to run the tests that require a network. Make sure you have this entry in /etc/hosts:

127.0.0.1       localhost.localdomain   localhost

Also make sure you have the loopback device lo configured. If you aren't sure, run:

panic% /sbin/ifconfig lo

This will tell you whether the loopback device is configured.
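The /etc/hosts part of this check can also be scripted. The following Python sketch parses hosts-file text and reports which addresses are mapped to the name localhost. The function name localhost_addresses is an assumption made for this example, and the sketch does not check the loopback device itself.

```python
def localhost_addresses(hosts_text):
    """Return the addresses that the given hosts-file text maps to 'localhost'."""
    addresses = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if "localhost" in names:
            addresses.append(address)
    return addresses


sample = "127.0.0.1       localhost.localdomain   localhost\n"
print(localhost_addresses(sample))
```

To test the real file, pass in open("/etc/hosts").read() and confirm that 127.0.0.1 is among the returned addresses.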

3.4 Installation (make install)

After testing the server, the last step is to install it. First install all the Perl files (usually as root):

panic# make install

Then go to the Apache source tree and complete the Apache installation (installing the configuration files, httpd, and utilities):

panic# cd ../apache_1.3.xx
 panic# make install

Of course, if you have used the APACHE_PREFIX option as explained earlier in this chapter, you can skip this step.

Now the installation should be considered complete. You may now configure your server and start using it.

3.4.1 Manually Building a mod_perl-Enabled Apache

If you want to build httpd separately from mod_perl, you should use the NO_HTTPD=1 option during the perl Makefile.PL (mod_perl build) stage. Then you will have to configure various things by hand and proceed to build Apache. You shouldn't run perl Makefile.PL before following the steps described in this section.

If you choose to manually build mod_perl, there are three things you may need to set up before the build stage:

mod_perl's Makefile

When perl Makefile.PL is executed, $APACHE_SRC/modules/perl/Makefile may need to be modified to enable various options (e.g., ALL_HOOKS=1).

Optionally, instead of tweaking the options during the perl Makefile.PL stage, you can edit mod_perl-1.xx/src/modules/perl/Makefile before running perl Makefile.PL.

Configuration

Add the following to apache_1.3.xx/src/Configuration:

AddModule modules/perl/libperl.a

We suggest you add this entry at the end of the Configuration file if you want your callback hooks to have precedence over core handlers.

Add the following to EXTRA_LIBS:

EXTRA_LIBS=`perl -MExtUtils::Embed -e ldopts`

Add the following to EXTRA_CFLAGS:

EXTRA_CFLAGS=`perl -MExtUtils::Embed -e ccopts`

mod_perl source files

Return to the mod_perl directory and copy the mod_perl source files into the Apache build directory:

panic% cp -r src/modules/perl apache_1.3.xx/src/modules/

When you are done with the configuration parts, run:

panic% perl Makefile.PL NO_HTTPD=1 DYNAMIC=1 EVERYTHING=1 \
     APACHE_SRC=../apache_1.3.xx/src

DYNAMIC=1 enables a build of the shared mod_perl library. Add other options if required.

panic# make install

Now you may proceed with the plain Apache build process. Note that in order for your changes to the apache_1.3.xx/src/Configuration file to take effect, you must run apache_1.3.xx/src/Configure instead of the default apache_1.3.xx/configure script:

panic% cd ../apache_1.3.xx/src
 panic% ./Configure
 panic% make
 panic# make install

3.5 Installation Scenarios for Standalone mod_perl

When building mod_perl, the mod_perl C source files that have to be compiled into the httpd executable usually are copied to the subdirectory src/modules/perl/ in the Apache source tree. In the past, to integrate this subtree into the Apache build process, a lot of adjustments were done by mod_perl's Makefile.PL. Makefile.PL was also responsible for the Apache build process.

This approach is problematic in several ways. It is very restrictive and not very clean, because it assumes that mod_perl is the only third-party module that has to be integrated into Apache.

A new hybrid build environment was therefore created for the Apache side of mod_perl, to avoid these problems. It prepares only the src/modules/perl/ subtree inside the Apache source tree, without adjusting or editing anything else. This way, no conflicts can occur. Instead, mod_perl is activated later (via APACI calls when the Apache source tree is configured), and then it configures itself.

There are various ways to build Apache with the new hybrid build environment (using USE_APACI=1):

Build Apache and mod_perl together, using the default configuration.
Build Apache and mod_perl separately, allowing you to plug in other third-party Apache modules as needed.
Build mod_perl as a DSO inside the Apache source tree using APACI.
Build mod_perl as a DSO outside the Apache source tree with APXS.

3.5.1 The All-in-One Way

If your goal is just to build and install Apache with mod_perl out of their source trees, and you have no interest in further adjusting or enhancing Apache, proceed as we described in Chapter 2:

panic% tar xzvf apache_1.3.xx.tar.gz
 panic% tar xzvf mod_perl-1.xx.tar.gz
 panic% cd mod_perl-1.xx
 panic% perl Makefile.PL APACHE_SRC=../apache_1.3.xx/src \
     DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
 panic% make && make test
 panic# make install
 panic# cd ../apache_1.3.xx
 panic# make install

This builds Apache statically with mod_perl, installs Apache under the default /usr/local/apache tree, and installs mod_perl into the site_perl hierarchy of your existing Perl installation.

3.5.2 Building mod_perl and Apache Separately

However, sometimes you might need more flexibility while building mod_perl. If you build mod_perl into the Apache binary (httpd) in separate steps, you'll also have the freedom to include other third-party Apache modules. Here are the steps:

Prepare the Apache source tree.

As before, first extract the distributions:

panic% tar xvzf apache_1.3.xx.tar.gz
 panic% tar xzvf mod_perl-1.xx.tar.gz

Install mod_perl's Perl side and prepare the Apache side.

Next, install the Perl side of mod_perl into the Perl hierarchy and prepare the src/modules/perl/ subdirectory inside the Apache source tree:
```
panic% cd mod_perl-1.xx
 panic% perl Makefile.PL \
     APACHE_SRC=../apache_1.3.xx/src \
     NO_HTTPD=1   \
     USE_APACI=1  \
     PREP_HTTPD=1 \
     EVERYTHING=1 \
     [...]
 panic% make
 panic# make install
```
The APACHE_SRC option sets the path to your Apache source tree, the NO_HTTPD option forces this path and only this path to be used, the USE_APACI option triggers the new hybrid build environment, and the PREP_HTTPD option forces preparation of the $APACHE_SRC/modules/perl/ tree but no automatic build.

This tells the configuration process to prepare the Apache side of mod_perl in the Apache source tree, but doesn't touch anything else in it. It then just builds the Perl side of mod_perl and installs it into the Perl installation hierarchy.

Note that if you use PREP_HTTPD as described above, to complete the build you must go into the Apache source directory and run make and make install.

Prepare other third-party modules.

Now you have a chance to prepare any other third-party modules you might want to include in Apache. For instance, you can build PHP separately, as you did with mod_perl.

Build the Apache package.

Now it's time to build Apache, including the Apache side of mod_perl and any other third-party modules you've prepared:
```
panic% cd apache_1.3.xx
 panic% ./configure \
     --prefix=/path/to/install/of/apache \
     --activate-module=src/modules/perl/libperl.a \
     [...]
 panic% make
 panic# make install
```
You must use the --prefix option if you want to change the default target directory of the Apache installation. The --activate-module option activates mod_perl for the configuration process and thus also for the build process. If you choose --prefix=/usr/share/apache, the Apache directory tree will be installed in /usr/share/apache.

If you add other third-party components, such as PHP, include a separate --activate-module option for each of them. (See the module's documentation for the actual path to which --activate-module should point.) For example, for mod_php4:
```
--activate-module=src/modules/php4/libphp4.a
```
Note that the files activated by --activate-module do not exist at this time. They will be generated during compilation.

You may also want to go back to the mod_perl source tree and run make test (to make sure that mod_perl is working) before running make install inside the Apache source tree.

For more detailed examples on building mod_perl with other components, see Section 3.6.

3.5.3 When DSOs Can Be Used

If you want to build mod_perl as a DSO, you must make sure that Perl was built with the system's native malloc( ). If Perl was built with its own malloc( ) and -Dbincompat5005, it pollutes the main httpd program with free and malloc symbols. When httpd starts or restarts, any references in the main program to free and malloc become invalid, causing memory leaks and segfaults.

Notice that mod_perl's build system warns about this problem.

With Perl 5.6.0+ this pollution can be prevented by using -Ubincompat5005, and with any version of Perl by using -Uusemymalloc. However, there's a chance that -Uusemymalloc might hurt performance on your platform, so -Ubincompat5005 is likely a better choice where it is available.

If you get the following reports with Perl version 5.6.0+:

% perl -V:usemymalloc
 usemymalloc='y';
 % perl -V:bincompat5005
 bincompat5005='define';

rebuild Perl with -Ubincompat5005.

For pre-5.6.x Perl versions, if you get:

% perl -V:usemymalloc
 usemymalloc='y';

rebuild Perl with -Uusemymalloc.

Now rebuild mod_perl.
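The decision described in this section can be summarized as a small function. This is only a sketch of the rules as stated above, written in Python. The function name perl_rebuild_advice and the tuple form of the version number are assumptions made for the example, while the two string arguments are the values printed by perl -V:usemymalloc and perl -V:bincompat5005.

```python
def perl_rebuild_advice(version, usemymalloc, bincompat5005):
    """Return the Configure flag to rebuild Perl with, or None if no rebuild is needed.

    version       -- (major, minor, patch) tuple, e.g. (5, 6, 1)
    usemymalloc   -- value reported by `perl -V:usemymalloc` ('y' or 'n')
    bincompat5005 -- value reported by `perl -V:bincompat5005` ('define' or 'undef')
    """
    if usemymalloc != "y":
        # Perl already uses the system's native malloc.
        return None
    if version >= (5, 6, 0):
        # Only the combination of Perl's own malloc and bincompat5005 is a problem.
        return "-Ubincompat5005" if bincompat5005 == "define" else None
    # Pre-5.6 Perls: the only remedy mentioned is to drop Perl's malloc.
    return "-Uusemymalloc"


print(perl_rebuild_advice((5, 6, 1), "y", "define"))
```

The performance caveat about -Uusemymalloc is not modeled; the function simply returns the flag that the text recommends for each case.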

3.5.4 Building mod_perl as a DSO via APACI

We have already mentioned that the new mod_perl build environment (with USE_APACI) is a hybrid. What does that mean? It means, for instance, that you can use the same src/modules/perl/ configuration to build mod_perl as a DSO or not, without having to edit any files. To build libperl.so, just add a single option, depending on which method you used to build mod_perl.

libperl.so and libperl.a

The static mod_perl library is called libperl.a, and the shared mod_perl library is called libperl.so. Of course, libmodperl would have been a better prefix, but libperl was used because of prehistoric Apache issues. Be careful that you don't confuse mod_perl's libperl.a and libperl.so files with the ones that are built with the standard Perl installation.

If you choose the "standard" all-in-one way of building mod_perl, add:

USE_DSO=1

to the perl Makefile.PL options.

If you choose to build mod_perl and Apache separately, add:

--enable-shared=perl

to Apache's configure options when you build Apache.

As you can see, whichever way you build mod_perl and Apache, only one additional option is needed to build mod_perl as a DSO. Everything else is done automatically: mod_so is automatically enabled, the Makefiles are adjusted, and the install target from APACI installs libperl.so into the Apache installation tree. Additionally, the LoadModule and AddModule directives (which dynamically load and insert mod_perl into httpd) are automatically added to httpd.conf.

3.5.5 Building mod_perl as a DSO via APXS

We've seen how to build mod_perl as a DSO inside the Apache source tree, but there is a nifty alternative: building mod_perl as a DSO outside the Apache source tree via the new Apache 1.3 support tool called APXS. The advantage is obvious: you can extend an already installed Apache with mod_perl even if you don't have the sources (for instance, you may have installed an Apache binary package from your vendor or favorite distribution).

Here are the build steps:

panic% tar xzvf mod_perl-1.xx.tar.gz
 panic% cd mod_perl-1.xx
 panic% perl Makefile.PL \
     USE_APXS=1 \
     WITH_APXS=/path/to/bin/apxs \
     EVERYTHING=1 \
     [...]
 panic% make && make test
 panic# make install

This will build the DSO libperl.so outside the Apache source tree and install it into the existing Apache hierarchy.

4.1 Apache Configuration

Apache configuration can be confusing. To minimize the number of things that can go wrong, it's a good idea to first configure Apache itself without mod_perl. So before we go into mod_perl configuration, let's look at the basics of Apache itself.

4.1.1 Configuration Files

Prior to Version 1.3.4, the default Apache installation used three configuration files: httpd.conf, srm.conf, and access.conf. Although there were historical reasons for having three separate files (dating back to the NCSA server), it stopped mattering which file you used for what a long time ago, and the Apache team finally decided to combine them. Apache Versions 1.3.4 and later are distributed with the configuration directives in a single file, httpd.conf. Therefore, whenever we mention a configuration file, we are referring to httpd.conf.

By default, httpd.conf is installed in the conf directory under the server root directory. The default server root is /usr/local/apache/ on many Unix platforms, but it can be any directory of your choice (within reason). Users new to Apache and mod_perl will probably find it helpful to keep to the directory layouts we use in this book.

There is also a special file called .htaccess , used for per-directory configuration. When Apache tries to access a file on the filesystem, it will first search for .htaccess files in the requested file's parent directories. If found, Apache scans .htaccess for further configuration directives, which it then applies only to that directory in which the file was found and its subdirectories. The name .htaccess is confusing, because it can contain almost any configuration directives, not just those related to resource access control. Note that if the following directive is in httpd.conf:

<Directory />
     AllowOverride None
 </Directory>

Apache will not look for .htaccess at all unless AllowOverride is set to a value other than None in a more specific <Directory> section.

.htaccess can be renamed by using the AccessFileName directive. The following example configures Apache to look in the target directory for a file called .acl instead of .htaccess:

AccessFileName .acl

However, you must also make sure that this file can't be accessed directly from the Web, or else you risk exposing your configuration. This is done automatically for .ht* files by Apache, but for other files you need to use:

<Files .acl>
     Order Allow,Deny
     Deny from all
 </Files>

Another often-mentioned file is the startup file, usually named startup.pl. This file contains Perl code that will be executed at server startup. We'll discuss the startup.pl file in greater detail later in this chapter, in Section 4.3.

Beware of editing httpd.conf without understanding all the implications. Modifying the configuration file and adding new directives can introduce security problems and have performance implications. If you are going to modify anything, read through the documentation beforehand. The Apache distribution comes with an extensive configuration manual. In addition, each section of the distributed configuration file includes helpful comments explaining how each directive should be configured and what the default values are.

If you haven't moved Apache's directories around, the installation program will configure everything for you. You can just start the server and test it. To start the server, use the apachectl utility bundled with the Apache distribution. It resides in the same directory as httpd, the Apache server itself. Execute:

panic% /usr/local/apache/bin/apachectl start

Now you can test the server, for example by accessing http://localhost/ from a browser running on the same host.

4.1.2 Configuration Directives

A basic setup requires little configuration. If you moved any directories after Apache was installed, they should be updated in httpd.conf. Here are just a couple of examples:

ServerRoot   "/usr/local/apache"
 DocumentRoot "/usr/local/apache/docs"

You can change the port to which the server is bound by editing the Port directive. This example sets the port to 8080 (the default for the HTTP protocol is 80):

Port 8080

You might want to change the user and group names under which the server will run. If Apache is started by the user root (which is generally the case), the parent process will continue to run as root, but its children will run as the user and group specified in the configuration, thereby avoiding many potential security problems. This example uses the httpd user and group:

User httpd
 Group httpd

Make sure that the user and group httpd already exist. They can be created using useradd(1) and groupadd(1) or equivalent utilities.

Many other directives may need to be configured as well. In addition to directives that take a single value, there are whole sections of the configuration (such as the <Directory> and <Location> sections) that apply to only certain areas of the web space. The httpd.conf file supplies a few examples, and these will be discussed shortly.

4.1.3 <Directory>, <Location>, and <Files> Sections

Let's discuss the basics of the <Directory>, <Location>, and <Files> sections. Remember that there is more to know about them than what we list here, and the rest of the information is available in the Apache documentation. The information we'll present here is just what is important for understanding mod_perl configuration.

Apache considers directories and files on the machine it runs on as resources. A particular behavior can be specified for each resource; that behavior will apply to every request for information from that particular resource.

Directives in <Directory> sections apply to specific directories on the host machine, and those in <Files> sections apply only to specific files (actually, groups of files with names that have something in common). <Location> sections apply to specific URIs. Locations are given relative to the document root, whereas directories are given as absolute paths starting from the filesystem root (/). For example, in the default server directory layout where the server root is /usr/local/apache and the document root is /usr/local/apache/htdocs, files under the /usr/local/apache/htdocs/pub directory can be referred to as:

<Directory /usr/local/apache/htdocs/pub>
 </Directory>

or alternatively (and preferably) as:

<Location /pub>
 </Location>

Exercise caution when using <Location> under Win32. The Windows family of operating systems is case-insensitive, but <Location> matching is not. In the above example, configuration directives specified for the location /pub will not be applied when the request URI is /Pub, even though both URIs can map to the same files on a case-insensitive filesystem. When URIs map to existing files, such as Apache::Registry scripts, it is safer to use the <Directory> or <Files> directives, which correctly canonicalize filenames according to local filesystem semantics.

It is up to you to decide which directories on your host machine are mapped to which locations. This should be done with care, because the security of the server may be at stake. In particular, essential system directories such as /etc/ shouldn't be mapped to locations accessible through the web server. As a general rule, it might be best to organize everything accessed from the Web under your ServerRoot, so that it stays organized and you can keep track of which directories are actually accessible.

Locations do not necessarily have to refer to existing physical directories, but may refer to virtual resources that the server creates upon a browser request. As you will see, this is often the case for a mod_perl server.

When a client (browser) requests a resource (URI plus optional arguments) from the server, Apache determines from its configuration whether or not to serve the request, whether to pass the request on to another server, what (if any) authentication and authorization is required for access to the resource, and which module(s) should be invoked to generate the response.

For any given resource, the various sections in the configuration may provide conflicting information. Consider, for example, a <Directory> section that specifies that authorization is required for access to the resource, and a <Files> section that says that it is not. It is not always obvious which directive takes precedence in such cases. This can be a trap for the unwary.

4.1.3.1 <Directory directoryPath> ... </Directory>

Scope: Can appear in server and virtual host configurations.

<Directory> and </Directory> are used to enclose a group of directives that will apply to only the named directory and its contents, including any subdirectories. Any directive that is allowed in a directory context (see the Apache documentation) may be used.

The path given in the <Directory> directive is either the full path to a directory, or a string containing wildcard characters (also called globs). In the latter case, ? matches any single character, * matches any sequence of characters, and [ ] matches character ranges. These are similar to the wildcards used by sh and similar shells. For example:

<Directory /home/httpd/docs/foo[1-2]>
     Options Indexes
 </Directory>

will match /home/httpd/docs/foo1 and /home/httpd/docs/foo2. None of the wildcards will match a / character. For example:

<Directory /home/httpd/docs>
     Options Indexes
 </Directory>

matches /home/httpd/docs and applies to all its subdirectories.
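To make the "no wildcard matches a / character" rule concrete, here is a simplified Python model of <Directory> glob matching. It is not Apache's implementation: it compares the pattern and the path one component at a time using the standard fnmatch module, and the function name directory_glob_matches is hypothetical.

```python
from fnmatch import fnmatchcase


def directory_glob_matches(pattern, path):
    """True if `path` is the directory named by `pattern` or lies beneath it."""
    pattern_parts = [part for part in pattern.split("/") if part]
    path_parts = [part for part in path.split("/") if part]
    if len(path_parts) < len(pattern_parts):
        return False
    # Matching component by component means a wildcard can never span a "/".
    # Extra trailing components in `path` are subdirectories, which also match.
    return all(
        fnmatchcase(part, pat) for pat, part in zip(pattern_parts, path_parts)
    )


print(directory_glob_matches("/home/httpd/docs/foo[1-2]", "/home/httpd/docs/foo1"))
print(directory_glob_matches("/home/*/public", "/home/a/b/public"))
```

In this model the first call matches, while the second does not, because * would have to cover the two components a/b.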

Matching a regular expression is done by using the <DirectoryMatch regex> ... </DirectoryMatch> or <Directory ~ regex> ... </Directory> syntax. For example:

<DirectoryMatch /home/www/.*/public>
     Options Indexes
 </DirectoryMatch>

will match /home/www/foo/public but not /home/www/foo/private. In a regular expression, .* matches any character (represented by .) zero or more times (represented by *). This is entirely different from the shell-style wildcards used by the <Directory> directive. Regular expressions make it easy to apply a common configuration to a set of public directories, and because they are more flexible than globs, this method provides more options to the experienced user.

If multiple (non-regular expression) <Directory> sections match the directory (or its parents) containing a document, the directives are applied in the order of the shortest match first, interspersed with the directives from any .htaccess files. Consider the following configuration:

<Directory />
     AllowOverride None
 </Directory>
 
 <Directory /home/httpd/docs/>
     AllowOverride FileInfo
 </Directory>

Let us detail the steps Apache goes through when it receives a request for the file /home/httpd/docs/index.html:

Apply the directive AllowOverride None (disabling .htaccess files).
Apply the directive AllowOverride FileInfo for the directory /home/httpd/docs/ (which now enables .htaccess in /home/httpd/docs/ and its subdirectories).
Apply any directives in the group FileInfo, which control document types (AddEncoding, AddLanguage, AddType, etc.; see the Apache documentation for more information) found in /home/httpd/docs/.htaccess.
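The shortest-match-first rule used in these steps can be sketched in Python as follows. The sketch tracks only the AllowOverride value and leaves out the .htaccess files themselves; the function name effective_allow_override and the data layout are assumptions made for the example.

```python
def effective_allow_override(sections, directory):
    """Return the AllowOverride value in effect for `directory`.

    sections -- list of (directory_path, allow_override_value) pairs,
                in configuration-file order
    """
    def components(path):
        return [part for part in path.split("/") if part]

    target = components(directory)
    matching = [
        (path, value)
        for path, value in sections
        if target[: len(components(path))] == components(path)
    ]
    # Shortest match first; list.sort() is stable, so sections of equal
    # length stay in configuration-file order.
    matching.sort(key=lambda item: len(components(item[0])))

    value = None
    for _, section_value in matching:
        value = section_value  # a later (more specific) section overrides
    return value


config = [("/", "None"), ("/home/httpd/docs/", "FileInfo")]
print(effective_allow_override(config, "/home/httpd/docs"))
```

Because the sections are sorted by length before being applied, swapping the two <Directory> blocks in the configuration file would not change the result.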

4.1.3.2 <Files filename> ... </Files>

Scope: Can appear in server and virtual host configurations, as well as in .htaccess files.

The <Files> directive provides access control by filename and is comparable to the <Directory> and <Location> directives. <Files> should be closed with the corresponding </Files>. The directives specified within this section will be applied to any object with a basename matching the specified filename. (A basename is the last component of a path, generally the name of the file.)

<Files> sections are processed in the order in which they appear in the configuration file, after the <Directory> sections and .htaccess files are read, but before <Location> sections. Note that <Files> can be nested inside <Directory> sections to restrict the portion of the filesystem to which they apply. However, <Files> cannot be nested inside <Location> sections.

The filename argument should include a filename or a wildcard string, where ? matches any single character and * matches any sequence of characters, just as with <Directory> sections. Extended regular expressions can also be used, placing a tilde character (~) between the directive and the regular expression. The regular expression should be in quotes. The dollar symbol ($) refers to the end of the string. The pipe character (|) indicates alternatives, and parentheses (()) can be used for grouping. Special characters in extended regular expressions must be escaped with backslashes (\). For example:

<Files ~ "\.(pl|cgi)$">
     SetHandler perl-script
     PerlHandler Apache::Registry
     Options +ExecCGI
 </Files>

would match all the files ending with the .pl or .cgi extension (most likely Perl scripts). Alternatively, the <FilesMatch regex> ... </FilesMatch> syntax can be used.
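The pattern can be tried outside Apache before it goes into the configuration. The following Python sketch applies the same expression to a few sample paths. Python's re module is a different regular expression engine from the one Apache uses, but this simple pattern means the same thing in both. The function name is_registry_script and the sample paths are made up for illustration.

```python
import re

# Same pattern as in the <Files ~ "..."> example above.
SCRIPT_PATTERN = re.compile(r"\.(pl|cgi)$")


def is_registry_script(path):
    """True if the basename of `path` ends with .pl or .cgi."""
    # <Files> is matched against the basename, the last component of the path.
    basename = path.rsplit("/", 1)[-1]
    return SCRIPT_PATTERN.search(basename) is not None


for candidate in ["/home/httpd/perl/test.pl", "index.cgi", "test.pl.bak"]:
    print(candidate, is_registry_script(candidate))
```

The escaped dot and the $ anchor matter here: without them, names such as test.pl.bak would also be handed to Apache::Registry.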

Regular Expressions

There is much more to regular expressions than what we have shown you here. Learning to use regular expressions is very important for a Perl programmer, and what you learn will be applicable to your Apache configuration too.

See the perlretut manpage and the book Mastering Regular Expressions by Jeffrey E. F. Friedl (O'Reilly) for more information.

4.1.3.3 <Location URI> ... </Location>

Scope: Can appear in server and virtual host configurations.

The <Location> directive provides for directive scope limitation by URI. It is similar to the <Directory> directive and starts a section that is terminated with the </Location> directive.

<Location> sections are processed in the order in which they appear in the configuration file, after the <Directory> sections, .htaccess files, and <Files> sections have been interpreted.

The <Location> section is the directive that is used most often with mod_perl.

Note that URIs do not have to refer to real directories or files within the filesystem at all; <Location> operates completely outside the filesystem. Indeed, it may sometimes be wise to ensure that <Location>s do not match real paths, to avoid confusion.

The URI may use wildcards. In a wildcard string, ? matches any single character, * matches any sequence of characters, and [ ] groups characters to match. For regular expression matches, use the <LocationMatch regex> ... </LocationMatch> syntax.

The <Location> functionality is especially useful when combined with the SetHandler directive. For example, to enable server status requests (via mod_status) but allow them only from browsers at *.example.com, you might use:

<Location /status>
     SetHandler server-status
     Order Deny,Allow
     Deny from all
     Allow from .example.com
 </Location>

As you can see, the /status path does not exist on the filesystem, but that doesn't matter because the filesystem isn't consulted for this request; it's passed on directly to mod_status.

4.1.4 Merging <Directory>, <Location>, and <Files> Sections

When configuring the server, it's important to understand the order in which the rules of each section are applied to requests. The order of merging is:

<Directory> (except for regular expressions) and .htaccess are processed simultaneously, with the directives in .htaccess overriding <Directory>.
<DirectoryMatch> and <Directory ~ > with regular expressions are processed next.
<Files> and <FilesMatch> are processed simultaneously.
<Location> and <LocationMatch> are processed simultaneously.

Apart from <Directory>, each group is processed in the order in which it appears in the configuration files. <Directory>s (group 1 above) are processed in order from the shortest directory component to the longest (e.g., first / and only then /home/www). If multiple <Directory> sections apply to the same directory, they are processed in the configuration file order.

Sections inside <VirtualHost> sections are applied as if you were running several independent servers. The directives inside one <VirtualHost> section do not interact with directives in other <VirtualHost> sections. They are applied only after processing any sections outside the virtual host definition. This allows virtual host configurations to override the main server configuration.

If there is a conflict, sections found later in the configuration file override those that come earlier.
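A toy model may help in reasoning about which section wins. The Python sketch below sorts the sections that match a request into the four groups listed above and lets later ones override earlier ones. It deliberately ignores the shortest-first ordering within <Directory>, virtual hosts, and any directive-specific merging behavior, and the names GROUP_ORDER and merge_sections, along with the settings dictionaries, are assumptions made for the example.

```python
# Merge groups in the order listed above; a lower number is applied earlier.
GROUP_ORDER = {
    "Directory": 1,       # processed together with .htaccess
    "DirectoryMatch": 2,
    "Files": 3,
    "FilesMatch": 3,
    "Location": 4,
    "LocationMatch": 4,
}


def merge_sections(sections):
    """Merge the settings of all sections that match a request.

    sections -- list of (kind, settings_dict) in configuration-file order
    """
    # sorted() is stable, so sections in the same group keep file order.
    ordered = sorted(sections, key=lambda section: GROUP_ORDER[section[0]])
    merged = {}
    for _, settings in ordered:
        merged.update(settings)  # later sections override earlier ones
    return merged


matching = [
    ("Files", {"auth_required": False}),
    ("Directory", {"auth_required": True, "indexes": True}),
]
print(merge_sections(matching))
```

In this model the <Files> section wins the conflict over auth_required even though it appears first in the file, because <Directory> sections are always applied before <Files> sections.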

4.1.5 Subgrouping of <Directory>, <Location>, and <Files> Sections

Let's say that you want all files to be handled the same way, except for a few of the files in a specific directory and its subdirectories. For example, say you want all the files in /home/httpd/docs to be processed as plain files, but any files ending with .html and .txt to be processed by the content handler of the Apache::Compress module (assuming that you are already running a mod_perl server):

<Directory /home/httpd/docs>
     <FilesMatch "\.(html|txt)$">
         PerlHandler +Apache::Compress
     </FilesMatch>
 </Directory>

The + before Apache::Compress tells mod_perl to load the Apache::Compress module before using it, as we will see later.

Using <FilesMatch>, it is possible to embed sections inside other sections to create subgroups that have their own distinct behavior. Alternatively, you could also use a <Files> section inside an .htaccess file.

Note that you can't put <Files> or <FilesMatch> sections inside a <Location> section, but you can put them inside a <Directory> section.

4.1.6 Options Directive Merging

Normally, if multiple Options directives apply to a directory, the most specific one is taken completely; the options are not merged.

However, if all the options on the Options directive are preceded by either a + or - symbol, the options are merged. Any options preceded by + are added to the options currently active, and any options preceded by - are removed.

For example, without any + or - symbols:

<Directory /home/httpd/docs>
    Options Indexes FollowSymLinks
</Directory>
<Directory /home/httpd/docs/shtml>
    Options Includes
</Directory>

Indexes and FollowSymLinks will be set for /home/httpd/docs/, but only Includes will be set for the /home/httpd/docs/shtml/ directory. However, if the second Options directive uses the + and - symbols:

<Directory /home/httpd/docs>
    Options Indexes FollowSymLinks
</Directory>
<Directory /home/httpd/docs/shtml>
    Options +Includes -Indexes
</Directory>

then the options FollowSymLinks and Includes will be set for the /home/httpd/docs/shtml/ directory.
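The merging rule can be modeled in a few lines of plain Perl. This is only a sketch of the rule as described above, not Apache's implementation; merge_options( ) is a hypothetical helper introduced for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Apache): apply one Options line to the
# set of options inherited from the less specific section.
sub merge_options {
    my ($inherited, @args) = @_;

    # Merging happens only if every argument carries a + or - prefix;
    # otherwise the new list replaces the inherited one completely.
    my $relative = @args && !grep { !/^[+-]/ } @args;
    return { map { $_ => 1 } @args } unless $relative;

    my %active = %$inherited;
    for my $arg (@args) {
        my ($sign, $name) = $arg =~ /^([+-])(.+)$/;
        if ($sign eq '+') { $active{$name} = 1 }
        else              { delete $active{$name} }
    }
    return \%active;
}

# Options set for /home/httpd/docs
my $docs = merge_options({}, qw(Indexes FollowSymLinks));

# The two variants shown for /home/httpd/docs/shtml
my $plain  = merge_options($docs, qw(Includes));
my $merged = merge_options($docs, qw(+Includes -Indexes));

print join(' ', sort keys %$plain),  "\n";   # Includes
print join(' ', sort keys %$merged), "\n";   # FollowSymLinks Includes
```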

4.1.7 MinSpareServers, MaxSpareServers, StartServers, MaxClients, and MaxRequestsPerChild

MinSpareServers, MaxSpareServers, StartServers, and MaxClients are standard Apache configuration directives that control the number of servers being launched at server startup and kept alive during the server's operation. When Apache starts, it spawns StartServers child processes. Apache makes sure that at any given time there will be at least MinSpareServers but no more than MaxSpareServers idle servers. However, the MinSpareServers rule is completely satisfied only if the total number of live servers is no bigger than MaxClients.

MaxRequestsPerChild lets you specify the maximum number of requests to be served by each child. When a process has served MaxRequestsPerChild requests, the parent kills it and replaces it with a new one. There may also be other reasons why a child is killed, so each child will not necessarily serve this many requests; however, each child will not be allowed to serve more than this number of requests. This feature is handy to gain more control of the server, and especially to avoid child processes growing too big (RAM-wise) under mod_perl.

These five directives are very important for getting the best performance out of your server. The process of tuning these variables is described in great detail in Chapter 11.
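As a rough illustration of how the five directives appear together in httpd.conf, consider the fragment below. The numbers are placeholders chosen for the example, not tuning recommendations; suitable values depend on your hardware and load.

```
# Illustrative values only (assumptions made for this example)
MinSpareServers        5
MaxSpareServers       10
StartServers           5
MaxClients           150
MaxRequestsPerChild 1000
```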

4.2 mod_perl Configuration

When you have tested that the Apache server works on your machine, it's time to configure the mod_perl part. Although some of the configuration directives are already familiar to you, mod_perl introduces a few new ones.

It's a good idea to keep all mod_perl-related configuration at the end of the configuration file, after the native Apache configuration directives, thus avoiding any confusion.

To ease maintenance and to simplify multiple-server installations, the mod_perl-enabled Apache server configuration system provides several alternative ways to keep your configuration directives in separate places. The Include directive in httpd.conf lets you include the contents of other files, just as if the information were all contained in httpd.conf. This is a feature of Apache itself. For example, placing all mod_perl-related configuration in a separate file named conf/mod_perl.conf can be done by adding the following directive to httpd.conf:

Include conf/mod_perl.conf

If you want to include this configuration conditionally, depending on whether your Apache has been compiled with mod_perl, you can use the IfModule directive:

<IfModule mod_perl.c>
    Include conf/mod_perl.conf
</IfModule>

mod_perl adds two more directives. <Perl> sections allow you to execute Perl code from within any configuration file at server startup time. Additionally, any file containing a Perl program can be executed at server startup simply by using the PerlRequire or PerlModule directives, as we will show shortly.
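A minimal sketch of a <Perl> section might look like the following. It assumes that mod_perl was built with <Perl> section support (for example, with EVERYTHING=1); the address and the module names are placeholders.

```
<Perl>
    # Ordinary Perl code, executed once while Apache reads its configuration.
    # Assigning to a variable named after a directive sets that directive.
    $ServerAdmin = 'webmaster@example.com';    # placeholder address
    @PerlModule  = qw(CGI Apache::Registry);   # same effect as a PerlModule line
</Perl>
```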

4.2.1 Alias Configurations

For many reasons, a server can never allow access to its entire directory hierarchy. Although there is really no indication of this given to the web browser, every path given in a requested URI is therefore a virtual path; early in the processing of a request, the virtual path given in the request must be translated to a path relative to the filesystem root, so that Apache can determine what resource is really being requested. This path can be considered to be a physical path, although it may not physically exist.

For instance, in mod_perl systems, you may intend that the translated path does not physically exist, because your module responds when it sees a request for this non-existent path by sending a virtual document. It creates the document on the fly, specifically for that request, and the document then vanishes. Many of the documents you see on the Web (for example, most documents that change their appearance depending on what the browser asks for) do not physically exist. This is one of the most important features of the Web, and one of the great powers of mod_perl is that it allows you complete flexibility to create virtual documents.

The ScriptAlias and Alias directives provide a mapping of a URI to a filesystem directory. The directive:

Alias /foo /home/httpd/foo

will map all requests starting with /foo to the files starting with /home/httpd/foo/. So when Apache receives a request for http://www.example.com/foo/test.pl, the server will map it to the file test.pl in the directory /home/httpd/foo/.

Additionally, ScriptAlias assigns all the requests that match the specified URI (e.g., /cgi-bin) to be executed by mod_cgi. The directive:

ScriptAlias /cgi-bin /home/httpd/cgi-bin

is actually the same as:

Alias /cgi-bin /home/httpd/cgi-bin
<Location /cgi-bin>
    SetHandler cgi-script
    Options +ExecCGI
</Location>

where the SetHandler directive invokes mod_cgi. You shouldn't use the ScriptAlias directive unless you want the request to be processed under mod_cgi. Therefore, when configuring mod_perl sections, use Alias instead.

Under mod_perl, the Alias directive will be followed by a section with at least two directives. The first is the SetHandler perl-script directive, which tells Apache to invoke mod_perl to run the script. The second directive (for example, PerlHandler) tells mod_perl which handler (Perl module) the script should be run under, and hence for which phase of the request. Later in this chapter, we discuss the available Perl*Handlers^[1] for the various request phases. A typical mod_perl configuration that will execute the Perl scripts under the Apache::Registry handler looks like this:

Alias /perl/ /home/httpd/perl/
<Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options +ExecCGI
</Location>

^[1] When we say Perl*Handler, we mean the collection of all Perl handler directives (PerlHandler, PerlAccessHandler, etc.).

The last directive tells Apache to execute the file as a program, rather than return it as plain text.

When you have decided which methods to use to run your scripts and where you will keep them, you can add the configuration directive(s) to httpd.conf. They will look like those below, but they will of course reflect the locations of your scripts in your filesystem and the decisions you have made about how to run the scripts:

ScriptAlias /cgi-bin/ /home/httpd/cgi-bin/
Alias       /perl/    /home/httpd/perl/
<Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options +ExecCGI
</Location>

In the examples above, all requests issued for URIs starting with /cgi-bin will be served from the directory /home/httpd/cgi-bin/, and those starting with /perl will be served from the directory /home/httpd/perl/.

4.2.1.1 Running scripts located in the same directory under different handlers

Sometimes you will want to map the same directory to a few different locations and execute each file according to the way it was requested. For example, in the following configuration:

# Typical for plain cgi scripts:
ScriptAlias /cgi-bin/  /home/httpd/perl/

# Typical for Apache::Registry scripts:
Alias       /perl/     /home/httpd/perl/

# Typical for Apache::PerlRun scripts:
Alias       /cgi-perl/ /home/httpd/perl/

<Location /perl/>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options +ExecCGI
</Location>
<Location /cgi-perl/>
    SetHandler perl-script
    PerlHandler Apache::PerlRun
    Options +ExecCGI
</Location>

the following three URIs:

http://www.example.com/perl/test.pl
http://www.example.com/cgi-bin/test.pl
http://www.example.com/cgi-perl/test.pl

are all mapped to the same file, /home/httpd/perl/test.pl. If test.pl is invoked with the URI prefix /perl, it will be executed under the Apache::Registry handler. If the prefix is /cgi-bin, it will be executed under mod_cgi, and if the prefix is /cgi-perl, it will be executed under the Apache::PerlRun handler.

This means that we can have all our CGI scripts located at the same place in the filesystem and call the script in any of three ways simply by changing one component of the URI (cgi-bin|perl|cgi-perl).

This technique makes it easy to migrate your scripts to mod_perl. If your script does not seem to work while running under mod_perl, in most cases you can easily call the script in straight mod_cgi mode or under Apache::PerlRun without making any script changes. Simply change the URL you use to invoke it.
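To check which mode served a given request, a small script along the following lines can help. It is a sketch, not part of the mod_perl distribution: the name test.pl and the execution_mode( ) helper are hypothetical, and it assumes that PerlSendHeader On (described in the next section) is added to the two mod_perl <Location> sections so that the printed header line is parsed. The check relies on mod_perl setting $ENV{MOD_PERL}, which plain mod_cgi does not do.

```perl
#!/usr/bin/perl
# test.pl: hypothetical script for checking how a request was dispatched
use strict;
use warnings;

# mod_perl sets $ENV{MOD_PERL}; plain mod_cgi leaves it unset.
sub execution_mode {
    my ($env) = @_;
    return $env->{MOD_PERL} ? "mod_perl ($env->{MOD_PERL})" : 'mod_cgi';
}

print "Content-type: text/plain\n\n";
print 'Requested as: ', ($ENV{SCRIPT_NAME} || 'unknown'), "\n";
print 'Running under: ', execution_mode(\%ENV), "\n";
```

The environment alone does not distinguish Apache::Registry from Apache::PerlRun, which is why the script also reports the URI under which it was requested.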

Although in the configuration above we have configured all three Aliases to point to the same directory within our filesystem, you can of course have them point to different directories if you prefer.

This should just be a migration strategy, though. In general, it's a bad idea to run scripts in plain mod_cgi mode from a mod_perl-enabled server; the extra resource consumption is wasteful. It is better to run these on a plain Apache server.

4.2.2 <Location /perl> Sections

The <Location> section assigns a number of rules that the server follows when the request's URI matches the location. Just as it is a widely accepted convention to use /cgi-bin for mod_cgi scripts, it is habitual to use /perl as the base URI of the Perl scripts running under mod_perl. Let's review the following very widely used <Location> section:

Alias /perl/ /home/httpd/perl/
PerlModule Apache::Registry
<Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options +ExecCGI
    Allow from all
    PerlSendHeader On
</Location>

This configuration causes all requests for URIs starting with /perl to be handled by the mod_perl Apache module with the handler from the Apache::Registry Perl module.

Remember the Alias from the previous section? We use the same Alias here. If you use a <Location> that does not have the same Alias, the server will fail to locate the script in the filesystem. You need the Alias setting only if the code that should be executed is located in a file. Alias just provides the URI-to-filepath translation rule.

Sometimes there is no script to be executed. Instead, a method in a module is being executed, as with /perl-status, the code for which is stored in an Apache module. In such cases, you don't need Alias settings for these <Location>s.

PerlModule is equivalent to Perl's native use( ) function call. We use it to load the Apache::Registry module, later used as a handler in the <Location> section.

Now let's go through the directives inside the <Location> section:

SetHandler perl-script: The SetHandler directive assigns the mod_perl Apache module to handle the content generation phase.
PerlHandler Apache::Registry: The PerlHandler directive tells mod_perl to use the Apache::Registry Perl module for the actual content generation.
Options +ExecCGI: Options +ExecCGI ordinarily tells Apache that it's OK for the directory to contain CGI scripts. In this case, the flag is required by Apache::Registry to confirm that you really know what you're doing. Additionally, all scripts located in directories handled by Apache::Registry must be executable, another check against wayward non-script files getting left in the directory accidentally. If you omit this option, the script either will be rendered as plain text or will trigger a Save As dialog, depending on the client. ^[2]

^[2] You can use Apache::RegistryBB to skip this and a few other checks.
Allow from all: The Allow directive is used to set access control based on the client's domain or IP address. The from all setting allows any client to run the script.
PerlSendHeader On: The PerlSendHeader On line tells mod_perl to intercept anything that looks like a header line (such as Content-Type: text/html) and automatically turn it into a correctly formatted HTTP header the way mod_cgi does. This lets you write scripts without bothering to call the request object's send_http_header( ) method, but it adds a small overhead because of the special handling.

If you use CGI.pm's header( ) function to generate HTTP headers, you do not need to activate this directive, because CGI.pm detects that it's running under mod_perl and calls send_http_header( ) for you.

You will want to set PerlSendHeader Off for non-parsed header (nph) scripts and generate all the HTTP headers yourself. This is also true for mod_perl handlers that send headers with the send_http_header( ) method, because having PerlSendHeader On as a server-wide configuration option might be a performance hit.

</Location>: </Location> closes the <Location> section definition.

Overriding <Location> Settings

Suppose you have:

<Location /foo>
    SetHandler perl-script
    PerlHandler Book::Module
</Location>

To remove a mod_perl handler setting from a location beneath a location where a handler is set (e.g., /foo/bar), just reset the handler like this:

<Location /foo/bar>
    SetHandler default-handler
</Location>

Now all requests starting with /foo/bar will be served by Apache's default handler, which serves the content directly.

4.2.3 PerlModule and PerlRequire

As we saw earlier, a module should be loaded before its handler can be used. PerlModule and PerlRequire are the two mod_perl directives that are used to load modules and code. They are almost equivalent to Perl's use( ) and require( ) functions (respectively) and are called from the Apache configuration file. You can pass one or more module names as arguments to PerlModule:

PerlModule Apache::DBI CGI DBD::mysql

Generally, modules are preloaded from the startup script, which is usually called startup.pl. This is a file containing Perl code that is executed through the PerlRequire directive. For example:

PerlRequire  /home/httpd/perl/lib/startup.pl

A PerlRequire filename can be absolute or relative to the ServerRoot or to a path in @INC.

As with any file with Perl code that gets use( )d or require( )d, it must return a true value. To ensure that this happens, don't forget to add 1; at the end of startup.pl.
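A bare-bones startup file might look like the sketch below. The library directory is hypothetical, and Carp and Data::Dumper merely stand in for whatever modules your own code uses (on a mod_perl server these would typically include modules such as Apache::Registry or CGI).

```perl
# startup.pl: a sketch of a startup file loaded via PerlRequire
use strict;
use warnings;

# Hypothetical directory for locally written modules
use lib '/home/httpd/perl/lib';

# Preload modules at server startup. The empty list suppresses imports.
# Carp and Data::Dumper are placeholders; list the modules your code uses.
use Carp ();
use Data::Dumper ();

# The file must return a true value
1;
```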

4.2.4 Perl*Handlers

As mentioned in Chapter 1, Apache specifies 11 phases of the request loop. In order of processing, they are: post-read-request, URI translation, header parsing, access control, authentication, authorization, MIME type checking, fixup, response (also known as the content handling phase), logging, and finally cleanup. These are the stages of a request where the Apache API allows a module to step in and do something. mod_perl provides dedicated configuration directives for each of these stages:

PerlPostReadRequestHandler
PerlInitHandler
PerlTransHandler
PerlHeaderParserHandler
PerlAccessHandler
PerlAuthenHandler
PerlAuthzHandler
PerlTypeHandler
PerlFixupHandler
PerlHandler
PerlLogHandler
PerlCleanupHandler

These configuration directives usually are referred to as Perl*Handler directives. The * in Perl*Handler is a placeholder to be replaced by something that identifies the phase to be handled. For example, PerlLogHandler is the Perl handler that (fairly obviously) handles the logging phase.

In addition, mod_perl adds a few more stages that happen outside the request loop:

PerlChildInitHandler

Allows your modules to initialize data structures during the startup of the child process.

PerlChildExitHandler

Allows your modules to clean up during the child process shutdown.

PerlChildInitHandler and PerlChildExitHandler might be used, for example, to allocate and deallocate system resources, pre-open and close database connections, etc. They do not refer to parts of the request loop.

PerlRestartHandler

Allows you to specify a routine that is called when the server is restarted. Since Apache always restarts itself immediately after it starts, this is a good phase for doing various initializations just before the child processes are spawned.

PerlDispatchHandler

Can be used to take over the process of loading and executing handler code. Instead of processing the Perl*Handler directives directly, mod_perl will invoke the routine pointed to by PerlDispatchHandler and pass it the Apache request object and a second argument indicating the handler that would ordinarily be invoked to process this phase. So for example, you can write a PerlDispatchHandler handler with a logic that will allow only specific code to be executed.
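As an illustration of the child startup and shutdown hooks described above, the following fragment registers one routine for each. The module Book::ChildHooks and its init( ) and cleanup( ) subroutines are hypothetical names; the fragment also assumes that mod_perl was built with these hooks enabled (for example, with EVERYTHING=1).

```
# Book::ChildHooks, init( ), and cleanup( ) are hypothetical names
PerlModule           Book::ChildHooks
PerlChildInitHandler Book::ChildHooks::init
PerlChildExitHandler Book::ChildHooks::cleanup
```

Because the subroutines are not named handler( ), the module is preloaded with PerlModule and each subroutine is named explicitly.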

Since most mod_perl applications need to handle only the response phase, in the default compilation, most of the Perl*Handlers are disabled. During the perl Makefile.PL mod_perl build stage, you must specify whether or not you will want to handle parts of the request loop other than the usual content generation phase. If this is the case, you need to specify which phases, or build mod_perl with the option EVERYTHING=1, which enables them all. All the build options are covered in detail in Chapter 3.

Note that it is mod_perl that recognizes these directives, not Apache. They are mod_perl directives, and an ordinary Apache server will not recognize them. If you get error messages about these directives being "perhaps mis-spelled," it is a sure sign that the appropriate part of mod_perl (or the entire mod_perl module!) is missing from your server.

All <Location>, <Directory>, and <Files> sections contain a physical path specification. Like PerlChildInitHandler and PerlChildExitHandler, the directives PerlPostReadRequestHandler and PerlTransHandler cannot be used in these sections, nor in .htaccess files, because the path translation isn't completed and a physical path isn't known until the end of the translation (PerlTransHandler) phase.

PerlInitHandler is more of an alias; its behavior changes depending on where it is used. In any case, it is the first handler to be invoked when serving a request. If found outside any <Location>, <Directory>, or <Files> section, it is an alias for PerlPostReadRequestHandler. When inside any such section, it is an alias for PerlHeaderParserHandler.

Starting with the header parsing phase, the requested URI has been mapped to a physical server pathname, and thus PerlHeaderParserHandler can be used to match a <Location>, <Directory>, or <Files> configuration section, or to process an .htaccess file if such a file exists in the specified directory in the translated path.

PerlDispatchHandler, PerlCleanupHandler, and PerlRestartHandler do not correspond to parts of the Apache API, but allow you to fine-tune the mod_perl API. They are specified outside configuration sections.

The Apache documentation and the book Writing Apache Modules with Perl and C (O'Reilly) provide in-depth information on the request phases.

4.2.5 The handler( ) Subroutine

By default, the mod_perl API expects a subroutine named handler( ) to handle the request in the registered Perl*Handler module. Thus, if your module implements this subroutine, you can register the handler with mod_perl by just specifying the module name. For example, to set the PerlHandler to Apache::Foo::handler, the following setting would be sufficient:

PerlHandler Apache::Foo

mod_perl will load the specified module for you when it is first used. Please note that this approach will not preload the module at startup. To make sure it gets preloaded, you have three options:

You can explicitly preload it with the PerlModule directive:
```
PerlModule Apache::Foo
```
You can preload it in the startup file:
```
use Apache::Foo ( );
```
You can use a nice shortcut provided by the Perl*Handler syntax:
```
PerlHandler +Apache::Foo
```
Note the leading + character. This directive is equivalent to:
```
PerlModule Apache::Foo
<Location ..>
    ...
    PerlHandler Apache::Foo
</Location>
```

If you decide to give the handler routine a name other than handler( ) (for example, my_handler( )), you must preload the module and explicitly give the name of the handler subroutine:

PerlModule Apache::Foo
<Location ..>
    ...
    PerlHandler Apache::Foo::my_handler
</Location>

This configuration will preload the module at server startup.

If a module needs to know which handler is currently being run, it can find out with the current_callback( ) method. This method is most useful to PerlDispatchHandlers that take action for certain phases only.

if ($r->current_callback eq "PerlLogHandler") {
    $r->warn("Logging request");
}

4.2.6 Investigating the Request Phases

Imagine a complex server setup in which many different Perl and non-Perl handlers participate in the request processing, and one or more of these handlers misbehaves. A simple example is one where one of the handlers alters the request record, which breaks the functionality of other handlers. Or maybe a handler invoked first for any given phase of the process returns an unexpected OK status, thus preventing other handlers from doing their job. You can't just add debug statements to trace the offender, because there are too many handlers involved.

The simplest solution is to get a trace of all registered handlers for each phase, stating whether they were invoked and what their return statuses were. Once such a trace is available, it's much easier to look only at the players that actually participated, thus narrowing the search down to a potentially misbehaving module.

The Apache::ShowRequest module shows the phases the request goes through, displaying module participation and response codes for each phase. The content response phase is not run, but possible modules are listed as defined. To configure it, just add this snippet to httpd.conf:

<Location /showrequest>
    SetHandler perl-script
    PerlHandler +Apache::ShowRequest
</Location>

To see what happens when you access some URI, append the URI to /showrequest. Apache::ShowRequest uses PATH_INFO to obtain the URI that should be executed. So, to run /index.html with Apache::ShowRequest, issue a request for /showrequest/index.html. For /perl/test.pl, issue a request for /showrequest/perl/test.pl.

This module produces rather lengthy output, so we will show only one section from the report generated while requesting /showrequest/index.html:

Running request for /index.html
 Request phase: post_read_request
   [snip]
 Request phase: translate_handler
    mod_perl ....................DECLINED
    mod_setenvif ................undef
    mod_auth ....................undef
    mod_access ..................undef
    mod_alias ...................DECLINED
    mod_userdir .................DECLINED
    mod_actions .................undef
    mod_imap ....................undef
    mod_asis ....................undef
    mod_cgi .....................undef
    mod_dir .....................undef
    mod_autoindex ...............undef
    mod_include .................undef
    mod_info ....................undef
    mod_status ..................undef
    mod_negotiation .............undef
    mod_mime ....................undef
    mod_log_config ..............undef
    mod_env .....................undef
    http_core ...................OK
 Request phase: header_parser
   [snip]
 Request phase: access_checker
   [snip]
 Request phase: check_user_id
   [snip]
 Request phase: auth_checker
   [snip]
 Request phase: type_checker
   [snip]
 Request phase: fixer_upper
   [snip]
 Request phase: response handler (type: text/html)
    mod_actions .................defined
    mod_include .................defined
    http_core ...................defined
 Request phase: logger
   [snip]

For each stage, we get a report of what modules could participate in the processing and whether they took any action. As you can see, the content response phase is not run, but possible modules are listed as defined. If we run a mod_perl script, the response phase looks like:

Request phase: response handler (type: perl-script)
    mod_perl ....................defined

4.2.7 Stacked Handlers

With the mod_perl stacked handlers mechanism, it is possible for more than one Perl*Handler to be defined and executed during any stage of a request.

Perl*Handler directives can define any number of subroutines. For example:

PerlTransHandler Foo::foo Bar::bar

Foo::foo( ) will be executed first and Bar::bar( ) second. As always, if the subroutine's name is handler( ), you can omit it.

With the Apache->push_handlers( ) method, callbacks (handlers) can be added to a stack at runtime by mod_perl modules.

Apache->push_handlers( ) takes the callback handler name as its first argument and a subroutine name or reference as its second. For example, let's add two handlers called my_logger1( ) and my_logger2( ) to be executed during the logging phase:

use Apache::Constants qw(:common);
sub my_logger1 {
    #some code here
    return OK;
}
sub my_logger2 {
    #some other code here
    return OK;
}
Apache->push_handlers("PerlLogHandler", \&my_logger1);
Apache->push_handlers("PerlLogHandler", \&my_logger2);

You can also pass a reference to an anonymous subroutine. For example:

use Apache::Constants qw(:common);

Apache->push_handlers("PerlLogHandler", sub {
    print STDERR "__ANON__ called\n";
    return OK;
});

After each request, this stack is erased.

All handlers will be called in turn, unless a handler returns a status other than OK or DECLINED.
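The rule can be modeled outside the server with a short plain-Perl sketch. It only mirrors the behavior stated above and is not mod_perl's actual dispatch code: run_stack( ) is a hypothetical helper, and the status values are spelled out locally (they match Apache 1.3's definitions) so that the example runs without Apache::Constants.

```perl
use strict;
use warnings;

# Status values as defined by Apache 1.3; normally imported from
# Apache::Constants, spelled out here so the sketch runs on its own.
use constant OK        => 0;
use constant DECLINED  => -1;
use constant FORBIDDEN => 403;

# Hypothetical stand-in for the dispatch loop: call each handler in turn
# and stop as soon as one returns something other than OK or DECLINED.
sub run_stack {
    my ($r, @handlers) = @_;
    my $status = DECLINED;
    for my $handler (@handlers) {
        $status = $handler->($r);
        last unless $status == OK || $status == DECLINED;
    }
    return $status;
}

my @log;
my $status = run_stack(
    {},                                       # placeholder for the request object
    sub { push @log, 'first';  return OK },
    sub { push @log, 'second'; return DECLINED },
    sub { push @log, 'third';  return FORBIDDEN },
    sub { push @log, 'fourth'; return OK },   # never reached
);

print "ran: @log, final status: $status\n";
```

Here the third handler's FORBIDDEN status ends the phase, so the fourth handler is never called.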

To enable this feature, build mod_perl with:

panic% perl Makefile.PL PERL_STACKED_HANDLERS=1 [ ... ]

or:

panic% perl Makefile.PL EVERYTHING=1 [ ... ]

To test whether the version of mod_perl you're running can stack handlers, use the Apache->can_stack_handlers method. This method will return a true value if mod_perl was configured with PERL_STACKED_HANDLERS=1, and a false value otherwise.

Let's look at a few real-world examples where push_handlers( ) is used:

The widely used CGI.pm module maintains a global object for its plain function interface. Since the object is global, under mod_perl it does not go out of scope when the request is completed, and the DESTROY method is never called. Therefore, CGI->new arranges to call the following code if it detects that the module is used in the mod_perl environment:
```
Apache->push_handlers("PerlCleanupHandler", \&CGI::_reset_globals);
```
This function is called during the final stage of a request, resetting CGI.pm's globals before the next request arrives.
Apache::DCELogin establishes a DCE login context that must exist for the lifetime of a request, so the DCE::Login object is stored in a global variable. Without stacked handlers, users must set the following directive in the configuration file to destroy the context:
```
PerlCleanupHandler Apache::DCELogin::purge
```
This is ugly. With stacked handlers, Apache::DCELogin::handler can call from within the code:
```
Apache->push_handlers("PerlCleanupHandler", \&purge);
```
Apache::DBI, the persistent database connection module, can pre-open the connection when the child process starts via its connect_on_init( ) function. This function uses push_handlers( ) to add a PerlChildInitHandler:
```
Apache->push_handlers(PerlChildInitHandler => \&childinit);
```
Now when the new process gets the first request, it already has the database connection open.

Apache::DBI also uses push_handlers( ) to have PerlCleanupHandler handle rollbacks if its AutoCommit attribute is turned off.
PerlTransHandlers (e.g., Apache::MsqlProxy) may decide, based on the URI or some arbitrary condition, whether or not to handle a request. Without stacked handlers, users must configure both handlers themselves:
```
PerlTransHandler Apache::MsqlProxy::translate
PerlHandler      Apache::MsqlProxy
```
PerlHandler is never actually invoked unless translate( ) sees that the request is a proxy request ($r->proxyreq). If it is a proxy request, translate( ) sets $r->handler("perl-script"), and only then will PerlHandler handle the request. Now users do not have to specify PerlHandler Apache::MsqlProxy, because the translate( ) function can set it with push_handlers( ).

Now let's write our own example using stacked handlers. Imagine that you want to piece together a document that includes footers, headers, etc. without using SSI. The following example shows how to implement it. First we prepare the code as shown in Example 4-1.

Example 4-1. Book/Compose.pm

package Book::Compose;
use Apache::Constants qw(OK);

sub header {
    my $r = shift;
    $r->send_http_header("text/plain");
    $r->print("header text\n");
    return OK;
}
sub body   {
    shift->print("body text\n");
    return OK;
}
sub footer {
    shift->print("footer text\n");
    return OK;
}
1;

The code defines the package Book::Compose, imports the OK constant, and defines three subroutines: header( ) to send the header, body( ) to create and send the actual content, and finally footer( ) to add a standard footer to the page. At the end of each handler we return OK, so the next handler, if any, will be executed.

To enable the construction of the page, we now supply the following configuration:

PerlModule Book::Compose
<Location /compose>
    SetHandler perl-script
    PerlHandler Book::Compose::header Book::Compose::body Book::Compose::footer
</Location>

We preload the Book::Compose module and construct the PerlHandler directive by listing the handlers in the order in which they should be invoked.^[3]

^[3] It may not seem to make sense to use this example, as it would be much simpler to write a single handler to call all three subroutines. But what if the three reside in different modules that are maintained by different authors?

Finally, let's look at the technique that allows parsing the output of another PerlHandler. For example, suppose your module generates HTML responses, but you want the same content to be delivered in plain text at a different location. This is a little trickier, but consider the following:

<Location /perl>
    SetHandler perl-script
    PerlHandler Book::HTMLContentGenerator
</Location>
<Location /text>
    SetHandler perl-script
    PerlHandler Book::HTML2TextConvertor Book::HTMLContentGenerator
</Location>

Notice that Book::HTML2TextConvertor is listed first. While its handler( ) will be called first, the actual code that does the conversion will run last, as we will explain in a moment. Now let's look at the sample code in Example 4-2.

Example 4-2. Book/HTML2TextConvertor.pm

package Book::HTML2TextConvertor;

use Apache::Constants qw(OK);

sub handler {
    my $r = shift;
    # route everything printed to STDOUT through PRINT( ) below
    untie *STDOUT;
    tie *STDOUT => __PACKAGE__, $r;
    # return OK explicitly so that the next handler in the stack runs
    return OK;
}

sub TIEHANDLE {
    my($class, $r) = @_;
    bless { r => $r}, $class;
}

sub PRINT {
    my $self = shift;
    for (@_) {
        # copy it so no 'read-only value modification' will happen
        my $line = $_;
        $line =~ s/<[^>]*>//g; # strip the html <tags>
        $self->{r}->print($line);
    }
}

1;

It untie( )s STDOUT and re-tie( )s it to its own package, so that content printed to STDOUT by the previous content generator in the pipe goes through this module. In the PRINT( ) method, we attempt to strip the HTML tags. Of course, this is only an example; correct HTML stripping actually requires more than one line of code and a quite complex regular expression, but you get the idea.
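The tie( ) mechanism itself can be tried outside the server. The following standalone sketch applies the same idea to an in-memory filehandle instead of the Apache request object; the package name Demo::StripTags is hypothetical, and the tag-stripping regular expression is the same simplistic one used above.

```perl
use strict;
use warnings;

package Demo::StripTags;    # hypothetical package name

sub TIEHANDLE {
    my ($class, $out) = @_;
    return bless { out => $out }, $class;
}

# Called for every print to the tied handle
sub PRINT {
    my $self = shift;
    for (@_) {
        my $line = $_;              # copy; the arguments may be read-only
        $line =~ s/<[^>]*>//g;      # strip the html <tags>
        print { $self->{out} } $line;
    }
    return 1;
}

package main;

# Collect the filtered output in a scalar via an in-memory filehandle
my $captured = '';
open my $sink, '>', \$captured or die "cannot open in-memory handle: $!";

tie *STDOUT, 'Demo::StripTags', $sink;
print "<h1>Hello</h1>\n", "<p>plain <b>text</b></p>\n";
untie *STDOUT;
close $sink;

# STDOUT is back to normal here
print $captured;
```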

4.2.8 Perl Method Handlers

If mod_perl was built with:

panic% perl Makefile.PL PERL_METHOD_HANDLERS=1 [ ... ]

or:

panic% perl Makefile.PL EVERYTHING=1 [ ... ]

it's possible to write method handlers in addition to function handlers. This is useful when you want to write code that takes advantage of inheritance. To make the handler act as a method under mod_perl, use the $$ function prototype in the handler definition. When mod_perl sees that the handler function is prototyped with $$, it'll pass two arguments to it: the calling object or a class, depending on how it was called, and the Apache request object. So you can write the handler as:

sub handler ($$) {
     my($self, $r) = @_;
     # ...
 }

The configuration is almost as usual. Just use the class name if the default method name handler( ) is used:

PerlHandler Book::SubClass

However, if you choose to use a different method name, the object-oriented notation should be used:

PerlHandler Book::SubClass->my_handler

The my_handler( ) method will then be called as a class (static) method.

Also, you can use objects created at startup to call methods. For example:

<Perl>
     use Book::SubClass;
     $Book::Global::object = Book::SubClass->new( );
 </Perl>
 ...
 PerlHandler $Book::Global::object->my_handler

In this example, the my_handler( ) method will be called as an instance method on the global object $Book::Global::object.
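To illustrate why method handlers are useful with inheritance, here is a minimal sketch that runs outside mod_perl. Everything in it is an assumption made for illustration: My::FakeRequest is a stand-in for the Apache request object, and the Book::BaseClass and Book::SubClass bodies are invented. The handler returns 0, which is the numeric value of the OK constant.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Apache request object, so that the sketch
# can run outside mod_perl. It only records what is sent to it.
package My::FakeRequest;
sub new              { return bless { out => '' }, shift }
sub send_http_header { my ($self, $type) = @_; $self->{type} = $type; return }
sub print            { my $self = shift; $self->{out} .= join '', @_; return 1 }

package Book::BaseClass;       # hypothetical parent class
sub handler ($$) {
    my ($self, $r) = @_;       # $self is a class name or an object
    $r->send_http_header('text/plain');
    $r->print($self->greeting, "\n");
    return 0;                  # numeric value of OK
}
sub greeting { return 'Hello from the base class' }

package Book::SubClass;        # inherits handler( ), overrides greeting( )
our @ISA = ('Book::BaseClass');
sub greeting { return 'Hello from the subclass' }

package main;

# Roughly what happens for "PerlHandler Book::SubClass" with a method handler
my $r = My::FakeRequest->new;
Book::SubClass->handler($r);
```

Because handler( ) receives the class name as its first argument, the subclass only has to override greeting( ); the inherited handler picks up the override through normal method resolution.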

4.2.9 PerlFreshRestart

To reload PerlRequire, PerlModule, and other use( )d modules, and to flush the Apache::Registry cache on server restart, add this directive to httpd.conf:

PerlFreshRestart On

You should be careful using this setting. It used to cause trouble in older versions of mod_perl, and some people still report problems using it. If you are not sure if it's working properly, a full stop and restart of the server will suffice.

Starting with mod_perl Version 1.22, PerlFreshRestart is ignored when mod_perl is compiled as a DSO. But it almost doesn't matter, as mod_perl as a DSO will do a full tear-down (calling perl_destruct( )).^[4]

^[4] The parent process would leak several MB on each restart without calling perl_destruct( ).

4.2.10 PerlSetEnv and PerlPassEnv

In addition to Apache's SetEnv and PassEnv directives, respectively setting and passing shell environment variables, mod_perl provides its own directives: PerlSetEnv and PerlPassEnv.

If you want to globally set an environment variable for the server, you can use the PerlSetEnv directive. For example, to configure the mod_perl tracing mechanism (as discussed in Chapter 21), add this to httpd.conf:

PerlSetEnv MOD_PERL_TRACE all

This will enable full mod_perl tracing.

Normally, PATH is the only shell environment variable available under mod_perl. If you need to rely on other environment variables, you can have mod_perl make those available for your code with PerlPassEnv.

For example, to forward the environment variable HOME (which is usually set to the home directory of the user who invoked the server), add this to httpd.conf:

PerlPassEnv HOME

Once you set the environment variable, it can be accessed via the %ENV hash in Perl (e.g., $ENV{HOME}).

PerlSetEnv and PerlPassEnv work just like the Apache equivalents, except that they take effect in the first phase of the Apache request cycle. The standard Apache directives SetEnv and PassEnv don't affect the environment until the fixup phase, which happens much later, just before content generation. This works for CGI scripts, which aren't run before then, but if you need to set some environment variables and access them in a handler invoked before the response stage, you should use the mod_perl directives. For example, handlers that want to use an Oracle relational database during the authentication phase might need to set the following environment variable (among others) in httpd.conf:

PerlSetEnv ORACLE_HOME /share/lib/oracle/

Note that PerlSetEnv will override the environment variables that were available earlier. For example, we have mentioned that PATH is always supplied by Apache itself. But if you explicitly set:

PerlSetEnv PATH /tmp

this setting will be used instead of the one set in the shell program.

As with other configuration scoping rules, if you place PerlSetEnv or PerlPassEnv in the scope of the configuration file, it will apply everywhere (unless overridden). If placed into a <Location> section, or another section in the same group, these directives will influence only the handlers in that section.

4.2.11 PerlSetVar and PerlAddVar

PerlSetVar is another directive introduced by mod_perl. It is very similar to PerlSetEnv, but the key/value pairs are stored in an Apache::Table object and retrieved using the dir_config( ) method.

There are two ways to use PerlSetVar. The first is the usual way, as a configuration directive. For example:

PerlSetVar foo bar

The other way is via Perl code in <Perl> sections:

<Perl>
     push @{ $Location{"/"}->{PerlSetVar} }, [ foo => 'bar' ];
 </Perl>

Now we can retrieve the value of foo using the dir_config( ) method:

$foo = $r->dir_config('foo');

Note that you cannot use the following code in <Perl> sections, which we discuss later in this chapter:

<Perl>
     my %foo = (a => 0, b => 1);
     push @{ $Location{"/"}->{PerlSetVar} }, [ foo => \%foo ];
 </Perl>

All values are passed to Apache::Table as strings, so you will get a stringified reference to a hash as a value (such as "HASH(0x87a5108)"). This cannot be turned back into the original hash upon retrieval.

However, you can use the PerlAddVar directive to push more values into the variable, emulating arrays. For example:

PerlSetVar foo bar
 PerlAddVar foo bar1
 PerlAddVar foo bar2

or the equivalent:

PerlAddVar foo bar
 PerlAddVar foo bar1
 PerlAddVar foo bar2

To retrieve the values, use the $r->dir_config->get( ) method:

my @foo = $r->dir_config->get('foo');

Obviously, you can always turn an array into a hash with Perl, so you can use this directive to pass hashes as well. Consider this example:

PerlAddVar foo key1
 PerlAddVar foo value1
 PerlAddVar foo key2
 PerlAddVar foo value2

You can then retrieve the hash in this way:

my %foo = $r->dir_config->get('foo');

Make sure that you use an even number of elements if you store the retrieved values in a hash.
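One way to guard against an odd number of elements is to check the list before building the hash. This is a minimal sketch; list_to_hash( ) is a hypothetical helper and not part of mod_perl. In a handler, the list would come from $r->dir_config->get('foo').

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical helper: turn a flat key/value list into a hash reference,
# refusing lists that cannot be paired up.
sub list_to_hash {
    my @values = @_;
    Carp::croak("odd number of elements (" . scalar(@values) . ")")
        if @values % 2;
    my %hash = @values;
    return \%hash;
}
```

With the four PerlAddVar lines shown above, list_to_hash( ) would return a reference to a hash with the keys key1 and key2. A missing value would raise an exception instead of silently producing a key with an undefined value.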

Passing a list or a hash via the PerlAddVar directive in a <Perl> section should be coded in this way:

<Perl>
   my %foo = (a => 0, b => 1);
   for (%foo) {
       push @{ $Location{"/"}->{PerlAddVar} }, [ foo => $_ ];
   }
 </Perl>

Now you get back the hash as before:

my %foo = $r->dir_config->get('foo');

This might not seem very practical; if you have more complex needs, think about having dedicated configuration files.

Customized configuration directives can also be created for the specific needs of a Perl module. To learn how to create these, please refer to Chapter 8 of Writing Apache Modules with Perl and C (O'Reilly), which covers this topic in great detail.

4.2.12 PerlSetupEnv

Certain Perl modules used in CGI code (such as CGI.pm) rely on a number of environment variables that are normally set by mod_cgi. For example, many modules depend on QUERY_STRING, SCRIPT_FILENAME, and REQUEST_URI. When the PerlSetupEnv directive is turned on, mod_perl provides these environment variables in the same fashion that mod_cgi does. This directive is On by default, which means that all the environment variables you are accustomed to being available under mod_cgi are also available under mod_perl.

The process of setting these environment variables adds overhead for each request, whether the variables are needed or not. If you don't use modules that rely on this behavior, you can turn it off in the general configuration and then turn it on in sections that need it (such as legacy CGI scripts):

PerlSetupEnv Off
 <Location /perl-run>
     SetHandler perl-script
     PerlHandler Apache::PerlRun
     Options +ExecCGI
     PerlSetupEnv On
 </Location>

You can use mod_perl methods to access the information provided by these environment variables (e.g., $r->path_info instead of $ENV{PATH_INFO}). For more details, see the explanation in Chapter 11.

4.2.13 PerlWarn and PerlTaintCheck

PerlWarn and PerlTaintCheck have two possible values, On and Off. PerlWarn turns warnings on and off globally for the whole server, and PerlTaintCheck controls whether the server runs with taint checking enabled. These two directives are also explained in Chapter 6.

4.3 The Startup File

At server startup, before child processes are spawned, you can do much more than just preload modules. You might want to register code that will initialize a database connection for each child when it is forked, tie read-only DBM files, fill in shared caches, etc.

The startup.pl file is an ideal place to put code that should be executed when the server starts. Once you have prepared the code, load it in httpd.conf before other mod_perl configuration directives with the PerlRequire directive:

PerlRequire  /home/httpd/perl/lib/startup.pl

Be careful with the startup file. Everything run at server initialization is run with root privileges if you start the server as root (which you have to do unless you choose to run the server on an unprivileged port, numbered 1024 or higher). This means that anyone who has write access to a script or module that is loaded by PerlModule, PerlRequire, or <Perl> sections effectively has root access to the system.

4.3.1 A Sample Startup File

Let's look at a real-world startup file. The elements of the file are shown here, followed by their descriptions.

use strict;

This pragma is worth using in every script longer than half a dozen lines. It will save a lot of time and debugging later.

use lib qw(/home/httpd/lib /home/httpd/extra-lib);

This permanently adds extra directories to @INC, something that's possible only during server startup. At the end of each request's processing, mod_perl resets @INC to the value it had after the server startup. Alternatively, you can use the PERL5LIB environment variable to add extra directories to @INC.

$ENV{MOD_PERL} or die "not running under mod_perl!";

This is a sanity check. If mod_perl wasn't properly built, the server startup is aborted.

use Apache::Registry ( );
 use LWP::UserAgent ( );
 use Apache::DBI ( );
 use DBI ( );

Preload the modules that get used by Perl code serving requests. Unless you need the symbols (variables and subroutines) exported by preloaded modules to accomplish something within the startup file, don't import them; it's just a waste of startup time and memory. Instead, use the empty import list ( ) to tell the import( ) function not to import anything.

use Carp ( );
 $SIG{__WARN__} = \&Carp::cluck;

This is a useful snippet to enable extended warnings logged in the error_log file. In addition to basic warnings, a trace of calls is added. This makes tracking potential problems a much easier task, since you know who called what.

The only drawback of this method is that it globally overrides the default warning handler behavior; thus, in some places it might be desirable to change the settings locally (for example, with local $^W=0, or no warnings under Perl 5.6.0 and higher). Usually warnings are turned off on production machines to prevent unnecessary clogging of the error_log file if your code is not very clean. Hence, this method is mostly useful in a development environment.
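Besides silencing warnings locally, the handler itself can be replaced for the duration of a single call. The sketch below uses local on $SIG{__WARN__}, which is a different technique from the local $^W=0 and no warnings approaches just mentioned. The helper name with_plain_warnings( ) is hypothetical.

```perl
use strict;
use warnings;
use Carp ();

$SIG{__WARN__} = \&Carp::cluck;    # global handler, as in the startup file

# Hypothetical helper: run $code with plain one-line warnings. The global
# handler is restored automatically when this subroutine returns.
sub with_plain_warnings {
    my ($code) = @_;
    local $SIG{__WARN__} = sub { print STDERR @_ };
    return $code->();
}
```

A call such as with_plain_warnings(sub { noisy_function( ) }), where noisy_function( ) stands for any code that warns, logs its warnings without a call trace, while the rest of the server keeps the Carp::cluck behavior.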

use CGI ( );
 CGI->compile(':all');

Some modules, such as CGI.pm, create their subroutines at runtime via AUTOLOAD to improve their loading time. This helps when the module includes many subroutines but only a few are actually used. (Also refer to the AutoSplit manpage.) Since the module is loaded only once with mod_perl, it might be a good idea to precompile all or some of its methods at server startup. This avoids the overhead of compilation at runtime. It also helps share more compiled code between child processes.

CGI.pm's compile( ) method performs this task. Note that compile( ) is specific to CGI.pm; other modules that implement this feature may use another name for the compilation method.

As with all modules we preload in the startup file, we don't import symbols from them because they will be lost when they go out of the file's scope.

The following code snippet makes sure that when the child process is spawned, a connection to the database is opened automatically, avoiding this performance hit on the first request:

Apache::DBI->connect_on_init
   ("DBI:mysql:database=test;host=localhost",
    "user", "password", {
                         PrintError => 1, # warn( ) on errors
                         RaiseError => 0, # don't die on error
                         AutoCommit => 1, # commit executes immediately
                        }
   );

We discuss this method in detail in Chapter 20.

The file ends with 1; so it can be successfully loaded by Perl.

The entire startup.pl file is shown in Example 4-3.

Example 4-3. startup.pl

use strict;
 
 use lib qw(/home/httpd/lib /home/httpd/extra-lib);
 $ENV{MOD_PERL} or die "not running under mod_perl!";
 
 use Apache::Registry ( );
 use LWP::UserAgent ( );
 use Apache::DBI ( );
 use DBI ( );
 
 use Carp ( );
 $SIG{__WARN__} = \&Carp::cluck;
 
 use CGI ( );
 CGI->compile(':all');
 
 Apache::DBI->connect_on_init
   ("DBI:mysql:database=test;host=localhost",
    "user", "password", {
                         PrintError => 1, # warn( ) on errors
                         RaiseError => 0, # don't die on error
                         AutoCommit => 1, # commit executes immediately
                        }
   );
 1;

4.3.2 Syntax Validation

If the startup file doesn't include any modules that require the mod_perl runtime environment during their loading, you can validate its syntax with:

panic% perl -cw /home/httpd/perl/lib/startup.pl

The -c switch tells Perl to validate only the file's syntax, and the -w switch enables warnings.

Apache::DBI is an example of a module that cannot be loaded outside of the mod_perl environment. If you try to load it, you will get the following error message:

panic% perl -MApache::DBI -c -e 1
 Can't locate object method "module" via package "Apache" 
 (perhaps you forgot to load "Apache"?) at 
 /usr/lib/perl5/site_perl/5.6.1/Apache/DBI.pm line 202.
 Compilation failed in require.
 BEGIN failed--compilation aborted.

However, Apache::DBI will work perfectly once loaded from within mod_perl.

4.3.3 What Modules Should Be Added to the Startup File

Every module loaded at server startup will be shared among the server children, saving a lot of RAM on your machine. Usually, we put most of the code we develop into modules and preload them.

You can even preload CGI scripts with Apache::RegistryLoader, as explained in Chapter 10.

4.3.4 The Confusion with use( ) in the Server Startup File

Some people wonder why they need to duplicate use Modulename in the startup file and in the script itself. The confusion arises due to misunderstanding use( ). Let's take the POSIX module as an example. When you write:

use POSIX qw(setsid);

use( ) internally performs two operations:

BEGIN {
     require POSIX;
     POSIX->import(qw(setsid));
 }

The first operation loads and compiles the module. The second calls the module's import( ) method and specifies to import the symbol setsid into the caller's namespace. The BEGIN block makes sure that the code is executed as soon as possible, before the rest of the code is even parsed. POSIX, like many other modules, specifies a default export list. This is an especially extensive list, so when you call:

use POSIX;

about 500 KB worth of symbols gets imported.

Usually, we don't need POSIX or its symbols in the startup file; all we want is to preload it. Therefore, we use an empty list as an argument for use( ):

use POSIX ( );

so the POSIX::import( ) method won't even be called.
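The split between loading and importing can be observed directly. This is a minimal sketch that runs in plain Perl; is_imported( ) is a hypothetical helper that checks whether a subroutine of the given name exists in a package.

```perl
use strict;
use warnings;

use POSIX ();    # load and compile POSIX, but import nothing

# Hypothetical helper: is a subroutine called $name defined in $package?
sub is_imported {
    my ($package, $name) = @_;
    no strict 'refs';
    return defined &{"${package}::${name}"} ? 1 : 0;
}

my $loaded = exists $INC{'POSIX.pm'} ? 1 : 0;       # the module is loaded...
my $before = is_imported(__PACKAGE__, 'setsid');    # ...but setsid is not here

POSIX->import('setsid');    # the second half of "use POSIX qw(setsid)"
my $after  = is_imported(__PACKAGE__, 'setsid');    # now it is
```

After the empty-list use( ), the module appears in %INC but the current package has no setsid( ). Only the explicit import( ) call makes the symbol available, which is the step that a later use POSIX qw(setsid) in the handler code performs without reloading the module.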

When we want to use the POSIX module in the code, we use( ) it again, but this time no loading overhead occurs because the module has been loaded already. If we want to import something from the module, we supply the list of symbols to load:

use POSIX qw(:flock_h);

This example loads constants used with the flock( ) function.

Technically, you aren't required to supply the use( ) statement in your handler code if the module has already been loaded during server startup or elsewhere. When writing your code, however, don't assume that the module code has been preloaded. Someday in the future, you or someone else will revisit this code and will not understand how it is possible to use a module's methods without first loading the module itself.

Please refer to the Exporter and perlmod manpages, and to the section on use( ) in the perlfunc manpage for more information about import( ).

Remember that you can always use require( ) to preload modules at server startup instead of use( ) with an empty import list ( ), because:

require Data::Dumper;

is the same as:

use Data::Dumper ( );

except that it's not executed at compile-time.

4.4 Apache Configuration in Perl

With <Perl> ... </Perl> sections, you can configure your server entirely in Perl. It's probably not worth it if you have simple configuration files, but if you run many virtual hosts or have complicated setups for any other reason, <Perl> sections become very handy. With <Perl> sections you can easily create the configuration on the fly, thus reducing duplication and easing maintenance.^[5]

^[5] You may also find that mod_macro is useful to simplify the configuration if you have to insert many repetitive configuration snippets.

To enable <Perl> sections, build mod_perl with:

panic% perl Makefile.PL PERL_SECTIONS=1 [ ... ]

or with EVERYTHING=1.

4.4.1 Constructing <Perl> Sections

<Perl> sections can contain any and as much Perl code as you wish. <Perl> sections are compiled into a special package called Apache::ReadConfig. mod_perl looks through the symbol table for Apache::ReadConfig for Perl variables and structures to grind through the Apache core configuration gears. Most of the configuration directives can be represented as scalars ($scalar) or arrays (@array). A few directives become hashes.

How do you know which Perl global variables to use? Just take the Apache directive name and prepend either $, @, or % (as shown in the following examples), depending on what the directive accepts. If you misspell the directive, it is silently ignored, so it's a good idea to check your settings.

Since Apache directives are case-insensitive, their Perl equivalents are case-insensitive as well. The following statements are equivalent:

$User = 'stas';
 $user = 'stas'; # the same

Let's look at all possible cases we might encounter while configuring Apache in Perl:

Directives that accept zero or one argument are represented as scalars. For example, CacheNegotiatedDocs is a directive with no arguments. In Perl, we just assign it an empty string:
```
<Perl>
     $CacheNegotiatedDocs = '';
 </Perl>
```
Directives that accept a single value are simple to handle. For example, to configure Apache so that child processes run as user httpd and group httpd, use:
```
User  httpd
 Group httpd
```
What if we don't want user and group definitions to be hardcoded? Instead, what if we want to define them on the fly using the user and group with which the server is started? This is easily done with <Perl> sections:
```
<Perl>
     $User  = getpwuid($>) || $>;
     $Group = getgrgid($)) || $);
 </Perl>
```
We use the power of the Perl API to retrieve the data on the fly. $User is set to the name of the effective user ID with which the server was started or, if the name is not defined, the numeric user ID. Similarly, $Group is set to either the symbolic value of the effective group ID or the numeric group ID.

Notice that we've just taken the Apache directives and prepended a $, as they represent scalars.

Directives that accept more than one argument are represented as arrays or as a space-delimited string. For example, this directive:
```
PerlModule Mail::Send Devel::Peek
```
becomes:
```
<Perl>
     @PerlModule = qw(Mail::Send Devel::Peek);
 </Perl>
```
@PerlModule is an array variable, and we assign it a list of modules. Alternatively, we can use the scalar notation and pass all the arguments as a space-delimited string:
```
<Perl>
     $PerlModule = "Mail::Send Devel::Peek";
 </Perl>
```

Directives that can be repeated more than once with different values are represented as arrays of arrays. For example, this configuration:

AddEncoding x-compress Z
 AddEncoding x-gzip gz tgz

becomes:

<Perl>
     @AddEncoding = (
         ['x-compress' => qw(Z)],
         ['x-gzip'     => qw(gz tgz)],
     );
 </Perl>

Directives that implement a container block, with beginning and ending delimiters such as <Location> ... </Location>, are represented as Perl hashes. In these hashes, the keys are the arguments of the opening directive, and the values are the contents of the block. For example:
```
Alias /private /home/httpd/docs/private
 <Location /private>
     DirectoryIndex  index.html index.htm
     AuthType        Basic
     AuthName        "Private Area"
     AuthUserFile    /home/httpd/docs/private/.htpasswd
     Require         valid-user
 </Location>
```
These settings tell Apache that URIs starting with /private are mapped to the physical directory /home/httpd/docs/private/ and will be processed according to the following rules:
- The users are to be authenticated using basic authentication.
- "Private Area" will be used as the title of the pop-up box displaying the login and password entry form.
- Only valid users listed in the password file /home/httpd/docs/private/.htpasswd and who provide a valid password may access the resources under /private/.
- If the filename is not provided, Apache will attempt to respond with the index.html or index.htm directory index file, if found.
Now let's see the equivalent <Perl> section:
```
<Perl>
     push @Alias, qw(/private /home/httpd/docs/private);
     $Location{"/private"} = {
         DirectoryIndex => [qw(index.html index.htm)],
         AuthType       => 'Basic',
         AuthName       => '"Private Area"',
         AuthUserFile   => '/home/httpd/docs/private/.htpasswd',
         Require        => 'valid-user',
     };
 </Perl>
```
First, we convert the Alias directive into an array @Alias. Instead of assigning, however, we push the values at the end. We do this because it's possible that we have assigned values earlier, and we don't want to overwrite them. Alternatively, you may want to push references to lists, like this:
```
push @Alias, [qw(/private /home/httpd/docs/private)];
```
Second, we convert the Location block, using /private as a key to the hash %Location and the rest of the block as its value. When the structures are nested, the normal Perl rules apply; that is, arrays and hashes turn into references. Therefore, DirectoryIndex points to an array reference. As shown earlier, we can always replace this array with a space-delimited string:
```
$Location{"/private"} = {
     DirectoryIndex => 'index.html index.htm',
     ...
 };
```
Also notice how we specify the value of the AuthName attribute:
```
AuthName => '"Private Area"',
```
The value is quoted twice because Apache expects a single value for this argument, and if we write:
```
AuthName => 'Private Area',
```
<Perl> will pass two values to Apache, "Private" and "Area", and Apache will refuse to start, with the following complaint:
```
[Thu May 16 17:01:20 2002] [error] <Perl>: AuthName takes one
 argument, The authentication realm (e.g. "Members Only")
```

If a block section accepts two or more identical keys (as the <VirtualHost> ... </VirtualHost> section does), the same rules as in the previous case apply, but a reference to an array of hashes is used instead.

In one company, we had to run an Intranet machine behind a NAT/firewall (using the 10.0.0.10 IP address). We decided up front to have two virtual hosts to make both the management and the programmers happy. We had the following simplistic setup:

NameVirtualHost 10.0.0.10
 
 <VirtualHost 10.0.0.10>
     ServerName  tech.intranet
     DocumentRoot /home/httpd/docs/tech
     ServerAdmin webmaster@tech.intranet
 </VirtualHost>
 
 <VirtualHost 10.0.0.10>
     ServerName   suit.intranet
     DocumentRoot /home/httpd/docs/suit
     ServerAdmin  webmaster@suit.intranet
 </VirtualHost>

In Perl, we wrote it as follows:

<Perl>
     $NameVirtualHost = '10.0.0.10';
     my $doc_root = "/home/httpd/docs";
     $VirtualHost{'10.0.0.10'} = [
         {
          ServerName   => 'tech.intranet',
          DocumentRoot => "$doc_root/tech",
          ServerAdmin  => 'webmaster@tech.intranet',
         },
         {
          ServerName   => 'suit.intranet',
          DocumentRoot => "$doc_root/suit",
          ServerAdmin  => 'webmaster@suit.intranet',
         },
     ];
 </Perl>

Because normal Perl rules apply, more entries can be added as needed using push( ).^[6] Let's say we want to create a special virtual host for the company's president to show off to his golf partners, but his fancy vision doesn't really fit the purpose of the Intranet site. We just let him handle his own site:

^[6] For complex configurations with multiple entries, consider using the module Tie::DxHash, which implements a hash that preserves insertion order and allows duplicate keys.

push @{ $VirtualHost{'10.0.0.10'} },
     {
      ServerName   => 'president.intranet',
      DocumentRoot => "$doc_root/president",
      ServerAdmin  => 'webmaster@president.intranet',
     };
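Since the entries are ordinary Perl data, the whole list can also be generated from a list of host names. The following is a minimal sketch; build_vhosts( ) is a hypothetical helper, and the .intranet suffix and directory layout are taken from the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: build the list of per-host hashes for one IP address.
# Inside a <Perl> section the result could be assigned with:
#   $VirtualHost{'10.0.0.10'} = build_vhosts($doc_root, qw(tech suit));
sub build_vhosts {
    my ($doc_root, @names) = @_;
    return [
        map {
            +{
                ServerName   => "$_.intranet",
                DocumentRoot => "$doc_root/$_",
                ServerAdmin  => "webmaster\@$_.intranet",
            }
        } @names
    ];
}
```

The unary plus in front of the inner braces tells Perl to treat them as an anonymous hash constructor rather than a nested block. Adding another virtual host then only requires adding a name to the list.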

Nested block directives naturally become Perl nested data structures. Let's extend an example from the previous section:

<Perl>
     my $doc_root = "/home/httpd/docs";
     push @{ $VirtualHost{'10.0.0.10'} },
         {
          ServerName   => 'president.intranet',
          DocumentRoot => "$doc_root/president",
          ServerAdmin  => 'webmaster@president.intranet',
          Location     => {
              "/private"    => {
                  Options       => 'Indexes',
                  AllowOverride => 'None',
                  AuthType      => 'Basic',
                  AuthName      => '"Do Not Enter"',
                  AuthUserFile  => 'private/.htpasswd',
                  Require       => 'valid-user',
              },
              "/perlrun" => {
                  SetHandler     => 'perl-script',
                  PerlHandler    => 'Apache::PerlRun',
                  PerlSendHeader => 'On',
                  Options        => '+ExecCGI',
              },
          },
         };
 </Perl>

We have added two Location blocks. The first, /private, is for the juicy stuff and accessible only to users listed in the president's password file. The second, /perlrun, is for running dirty Perl CGI scripts, to be handled by the Apache::PerlRun handler.

<Perl> sections don't provide equivalents for <IfModule> and <IfDefine> containers. Instead, you can use the module( ) and define( ) methods from the Apache package. For example:
```
<IfModule mod_ssl.c>
     Include ssl.conf
 </IfModule>
```
can be written as:
```
if (Apache->module("mod_ssl.c")) {
     push @Include, "ssl.conf";
 }
```
And this configuration example:
```
<IfDefine SSL>
     Include ssl.conf
 </IfDefine>
```
can be written as:
```
if (Apache->define("SSL")) {
     push @Include, "ssl.conf";
 }
```
Now that you know how to convert the usual configuration directives to Perl code, there's no limit to what you can do with it. For example, you can put environment variables in an array and then pass them all to the children with a single configuration directive, rather than listing each one via PassEnv or PerlPassEnv:
```
<Perl>
     my @env = qw(MYSQL_HOME CVS_RSH);
     push @PerlPassEnv, \@env;
 </Perl>
```
Or suppose you have a cluster of machines with similar configurations and only small distinctions between them. Ideally, you would want to maintain a single configuration file, but because the configurations aren't exactly the same (for example, the ServerName directive will have to differ), it's not quite that simple.

<Perl> sections come to the rescue. Now you can have a single configuration file and use the full power of Perl to tweak the local configuration. For example, to solve the problem of the ServerName directive, you might have this <Perl> section:
```
<Perl>
     use Sys::Hostname;
     $ServerName = hostname( );
 </Perl>
```
and the right machine name will be assigned automatically.

Or, if you want to allow personal directories on all machines except the ones whose names start with secure, you can use:
```
<Perl>
     use Sys::Hostname;
     $ServerName = hostname( );
     if ($ServerName !~ /^secure/) {
         $UserDir = "public.html";
     }
 </Perl>
```

4.4.2 Breaking Out of <Perl> Sections

Behind the scenes, mod_perl defines a package called Apache::ReadConfig in which it keeps all the variables that you define inside the <Perl> sections. So <Perl> sections aren't the only way to use mod_perl to configure the server: you can also place the Perl code in a separate file that will be called during the configuration parsing with either PerlModule or PerlRequire directives, or from within the startup file. All you have to do is to declare the package Apache::ReadConfig before writing any code in this file.

Using the last example from the previous section, we place the code into a file named apache_config.pl, shown in Example 4-4.

Example 4-4. apache_config.pl

package Apache::ReadConfig;
 
 use Sys::Hostname;
 $ServerName = hostname( );
 if ($ServerName !~ /^secure/) {
     $UserDir = "public.html";
 }
 1;

Then we execute it either from httpd.conf:

PerlRequire /home/httpd/perl/lib/apache_config.pl

or from the startup.pl file:

require "/home/httpd/perl/lib/apache_config.pl";

4.4.3 Cheating with Apache->httpd_conf

In fact, you can create a complete configuration file in Perl. For example, instead of putting the following lines in httpd.conf:

NameVirtualHost         10.0.0.10
 
 <VirtualHost 10.0.0.10>
     ServerName  tech.intranet
     DocumentRoot /home/httpd/docs/tech
     ServerAdmin webmaster@tech.intranet
 </VirtualHost>
 
 <VirtualHost 10.0.0.10>
     ServerName   suit.intranet
     DocumentRoot /home/httpd/docs/suit
     ServerAdmin webmaster@suit.intranet
 </VirtualHost>

You can write it in Perl:

use Socket;
 use Sys::Hostname;
 my $hostname = hostname( );
 (my $domain = $hostname) =~ s/[^.]+\.//;
 my $ip = inet_ntoa(scalar gethostbyname($hostname || 'localhost'));
 my $doc_root = '/home/httpd/docs';
 
 Apache->httpd_conf(qq{
 NameVirtualHost $ip
 
 <VirtualHost $ip>
   ServerName  tech.$domain
   DocumentRoot $doc_root/tech
   ServerAdmin webmaster\@tech.$domain
 </VirtualHost>
 
 <VirtualHost $ip>
   ServerName   suit.$domain
   DocumentRoot $doc_root/suit
   ServerAdmin  webmaster\@suit.$domain
 </VirtualHost>
  });

First, we prepare the data, such as deriving the domain name and IP address from the hostname. Next, we construct the configuration file in the "usual" way, but using the variables that were created on the fly. We can reuse this configuration file on many machines, and it will work anywhere without any need for adjustment.

Now consider that you have many more virtual hosts with a similar configuration. You have probably already guessed what we are going to do next:

use Socket;
 use Sys::Hostname;
 my $hostname = hostname( );
 (my $domain = $hostname) =~ s/[^.]+\.//;
 my $ip = inet_ntoa(scalar gethostbyname($hostname || 'localhost'));
 my $doc_root = '/home/httpd/docs';
 my @vhosts = qw(suit tech president);
 
 Apache->httpd_conf("NameVirtualHost $ip");
 
 for my $vh (@vhosts) {
   Apache->httpd_conf(qq{
 <VirtualHost $ip>
   ServerName  $vh.$domain
   DocumentRoot $doc_root/$vh
   ServerAdmin webmaster\@$vh.$domain
 </VirtualHost>
  });
 }

In the loop, we create new virtual hosts. If we need to create 100 hosts, it doesn't take a long time; just adjust the @vhosts array.
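When configuration text is generated this way, it can help to build the text in a plain subroutine that can be inspected before the result is handed to Apache->httpd_conf( ). This is a minimal sketch under that assumption; domain_of( ) and render_vhost( ) are hypothetical helpers that reuse the substitution and the block layout from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the leftmost label from a hostname,
# using the same substitution as the example above.
sub domain_of {
    my ($hostname) = @_;
    (my $domain = $hostname) =~ s/[^.]+\.//;
    return $domain;
}

# Hypothetical helper: render one <VirtualHost> block as text.
sub render_vhost {
    my ($ip, $name, $domain, $doc_root) = @_;
    return join "\n",
        "<VirtualHost $ip>",
        "  ServerName   $name.$domain",
        "  DocumentRoot $doc_root/$name",
        "  ServerAdmin  webmaster\@$name.$domain",
        "</VirtualHost>",
        '';
}
```

Inside the loop, the call would become Apache->httpd_conf(render_vhost($ip, $vh, $domain, $doc_root)). Note that domain_of( ) returns a hostname without a dot unchanged, so a machine whose hostname is not fully qualified would need separate handling.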

4.4.4 Declaring Package Names in Perl Sections

Be careful when you declare package names inside <Perl> sections. For example, this code has a problem:

<Perl>
     package Book::Trans;
     use Apache::Constants qw(:common);
     sub handler { OK }
 
     $PerlTransHandler = "Book::Trans";
 </Perl>

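When you put code inside a <Perl> section, by default it goes into the Apache::ReadConfig package, which is already declared for you. In the example above, however, the package Book::Trans declaration is still in effect when $PerlTransHandler is assigned, so the assignment creates $Book::Trans::PerlTransHandler rather than a global variable in the Apache::ReadConfig package. This means that the PerlTransHandler we tried to define will be ignored.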
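The effect can be reproduced in plain Perl, outside the server. The sketch below is only an illustration: nothing declares Apache::ReadConfig here, the handler returns a literal 0 in place of the OK constant, and the show( ) helper is hypothetical.

```perl
#!/usr/bin/perl
# Deliberately no "use strict": like the <Perl> section above, this
# demo relies on unqualified package globals.

package Book::Trans;                  # never closed, so it stays in effect
sub handler { 0 }                     # 0 stands in for Apache::Constants::OK

$PerlTransHandler = "Book::Trans";    # actually sets $Book::Trans::PerlTransHandler

package main;

# Hypothetical helper: print a variable's name and value,
# or "(not set)" if the value is undefined.
sub show {
    my ($name, $value) = @_;
    printf "%-40s %s\n", $name, defined $value ? $value : '(not set)';
}

show('$Book::Trans::PerlTransHandler',        $Book::Trans::PerlTransHandler);
show('$Apache::ReadConfig::PerlTransHandler', $Apache::ReadConfig::PerlTransHandler);
```

The first variable holds the handler name and the second is not set, which is why mod_perl finds nothing to configure.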

If you define a different package name within a <Perl> section, make sure to close the scope of that package and return to the Apache::ReadConfig package when you want to define the configuration directives. You can do this by either explicitly declaring the Apache::ReadConfig package:

<Perl>
     package Book::Trans;
     use Apache::Constants qw(:common);
     sub handler { OK }
 
     package Apache::ReadConfig;
     $PerlTransHandler = "Book::Trans";
 </Perl>

or putting the code that resides in a different package into a block:

<Perl>
     {
         package Book::Trans;
         use Apache::Constants qw(:common);
         sub handler { OK }
     }
 
     $PerlTransHandler = "Book::Trans";
 </Perl>

so that when the block is over, the Book::Trans package's scope is over, and you can use the configuration variables again.

However, it's probably a good idea to use <Perl> sections only to create or adjust configuration directives. If you need to run some other code not related to configuration, it might be better to place it in the startup file or in its own module. Your mileage may vary, of course.

4.4.5 Verifying <Perl> Sections

How do we know whether the configuration made inside <Perl> sections was correct?

First we need to check the validity of the Perl syntax. To do that, we should turn it into a Perl script, by adding #!perl at the top of the section:

<Perl>
#!perl
# ... code here ...
__END__
</Perl>

Notice that #!perl and __END__ must start in column zero. Also, the same rules that we saw earlier for validating the startup file apply: if the <Perl> section loads modules that can be loaded only when mod_perl is running, this validation is not applicable.

Now we may run:

perl -cx httpd.conf

If the Perl code doesn't compile, the server won't start. If the Perl code is syntactically correct, but the generated Apache configuration is invalid, <Perl> sections will just log a warning and carry on, since there might be globals in the section that are not intended for the configuration at all.

If you have more than one <Perl> section, you will have to repeat this procedure for each section, to make sure they all work.

To check the Apache configuration syntax, you can use the variable $Apache::Server::StrictPerlSections, added in mod_perl Version 1.22. If you set this variable to a true value:

$Apache::Server::StrictPerlSections = 1;

then mod_perl will not tolerate invalid Apache configuration syntax and will croak (die) if it encounters invalid syntax. The default value is 0. If you don't set $Apache::Server::StrictPerlSections to 1, you should localize variables unrelated to configuration with my( ) to avoid errors.

If the syntax is correct, the next thing we need to look at is the parsed configuration as seen by Perl. There are two ways to see it. First, we can dump it at the end of the section:

<Perl>
     use Apache::PerlSections ( );
     # code goes here
     print STDERR Apache::PerlSections->dump( );
 </Perl>

Here, we load the Apache::PerlSections module at the beginning of the section, and at the end we can use its dump( ) method to print out the configuration as seen by Perl. Notice that only the configuration created in the section will be seen in the dump. No plain Apache configuration can be found there.

For example, if we adjust this section (parts of which we have seen before) to dump the parsed contents:

<Perl>
     use Apache::PerlSections ( );
     $User  = getpwuid($>) || $>;
     $Group = getgrgid($)) || $);
     push @Alias, [qw(/private /home/httpd/docs/private)];
     my $doc_root = "/home/httpd/docs";
     push @{ $VirtualHost{'10.0.0.10'} },
         {
          ServerName   => 'president.intranet',
          DocumentRoot => "$doc_root/president",
          ServerAdmin  => 'webmaster@president.intranet',
          Location     => {
              "/private"    => {
                  Options       => 'Indexes',
                  AllowOverride => 'None',
                  AuthType      => 'Basic',
                  AuthName      => '"Do Not Enter"',
                  AuthUserFile  => 'private/.htpasswd',
                  Require       => 'valid-user',
              },
              "/perlrun" => {
                  SetHandler     => 'perl-script',
                  PerlHandler    => 'Apache::PerlRun',
                  PerlSendHeader => 'On',
                  Options        => '+ExecCGI',
              },
          },
         };
     print STDERR Apache::PerlSections->dump( );
 </Perl>

This is what we get as a dump:

package Apache::ReadConfig;
 #hashes:
 
 %VirtualHost = (
   '10.0.0.10' => [
     {
       'Location' => {
         '/private' => {
           'AllowOverride' => 'None',
           'AuthType' => 'Basic',
           'Options' => 'Indexes',
           'AuthUserFile' => 'private/.htpasswd',
           'AuthName' => '"Do Not Enter"',
           'Require' => 'valid-user'
         },
         '/perlrun' => {
           'PerlHandler' => 'Apache::PerlRun',
           'Options' => '+ExecCGI',
           'PerlSendHeader' => 'On',
           'SetHandler' => 'perl-script'
         }
       },
       'DocumentRoot' => '/home/httpd/docs/president',
       'ServerAdmin' => 'webmaster@president.intranet',
       'ServerName' => 'president.intranet'
     }
   ]
 );
 
 #arrays:
 
 @Alias = (
   [
     '/private',
     '/home/httpd/docs/private'
   ]
 );
 
 #scalars:
 
 $Group = 'stas';
 
 $User = 'stas';
 
 1;
 __END__

You can see that the configuration was created properly. The dump places the output into three groups: arrays, hashes, and scalars. The server was started as user stas, so the $User and $Group settings were dynamically assigned to the user stas.

A different approach to seeing the dump at any time (not only during startup) is to use the Apache::Status module (see Chapter 9). First we store the Perl configuration:

<Perl>
     $Apache::Server::SaveConfig = 1;
     # the actual configuration code
 </Perl>

Now the Apache::ReadConfig namespace (in which the configuration data is stored) will not be flushed, making configuration data available to Perl modules at request time. If the Apache::Status module is configured, you can view it by going to the /perl-status URI (or another URI that you have chosen) in your browser and selecting "Perl Section Configuration" from the menu. The configuration data should look something like that shown in Figure 4-1.

Figure 4-1. <Perl> sections configuration dump

Since the Apache::ReadConfig namespace is not flushed when the server is started, you can access the configuration values from your code: the data resides in the Apache::ReadConfig package. So if you had the following Perl configuration:

<Perl>
     $Apache::Server::SaveConfig = 1;
     $DocumentRoot = "/home/httpd/docs/mine";
 </Perl>

at request time, you could access the value of $DocumentRoot with the fully qualified name $Apache::ReadConfig::DocumentRoot. But usually you don't need to do this, because mod_perl provides you with an API to access the most interesting and useful server configuration bits.

4.4.6 Saving the Perl Configuration

Instead of dumping the generated Perl configuration, you may decide to store it in a file. For example, if you want to store it in httpd_config.pl, you can do the following:

<Perl>
     use Apache::PerlSections ( );
     # code goes here
     Apache::PerlSections->store("httpd_config.pl");
 </Perl>

You can then require( ) that file in some other <Perl> section. If you have the whole server configuration in Perl, you can start the server using the following trick:

panic% httpd -C "PerlRequire httpd_config.pl"

Apache will fetch all the configuration directives from httpd_config.pl, so you don't need httpd.conf at all.

4.4.7 Debugging

If your configuration doesn't seem to do what it's supposed to do, you should debug it. First, build mod_perl with:

panic% perl Makefile.PL PERL_TRACE=1 [...]

Next, set the environment variable MOD_PERL_TRACE to s (as explained in Chapter 21). Now you should be able to see how the <Perl> section globals are converted into directive string values. For example, suppose you have the following Perl section:

<Perl>
     $DocumentRoot = "/home/httpd/docs/mine";
 </Perl>

If you start the server in single-server mode (e.g., under bash):

panic% MOD_PERL_TRACE=s httpd -X

you will see these lines among the printed trace:

...
 SVt_PV: $DocumentRoot = `/home/httpd/docs/mine'
 handle_command (DocumentRoot /home/httpd/docs/mine): OK
 ...

But what if you mistype the directory name and pass two values instead of a single value? When you start the server, you'll see the following error:

...
 SVt_PV: $DocumentRoot = `/home/httpd/docs/ mine'
 handle_command (DocumentRoot /home/httpd/docs/ mine):
 DocumentRoot takes one argument,
 Root directory of the document tree
 ...

and of course the error will be logged in the error_log file:

[Wed Dec 20 23:47:31 2000] [error]
 (2)No such file or directory: <Perl>:
 DocumentRoot takes one argument,
 Root directory of the document tree

4.5 Validating the Configuration Syntax

Before you restart a server on a live production machine after the configuration has been changed, it's essential to validate that the configuration file is not broken. If the configuration is broken, the server won't restart and users will find your server offline for the time it'll take you to fix the configuration and start the server again.

You can use apachectl configtest or httpd -t to validate the configuration file without starting the server. You can safely validate the configuration file on a running production server, as long as you run this test before you restart the server with apachectl restart. Of course, it is not 100% perfect, but it will reveal any syntax errors you might have made while editing the file.

The validation procedure doesn't just parse the code in startup.pl; it executes it too. <Perl> sections invoke the Perl interpreter when the configuration files are read, and PerlRequire and PerlModule do so as well.

Of course, we assume that the code that gets called during this test cannot cause any harm to your running production environment. If you're worried about that, you can prevent the code in the startup script and in <Perl> sections from being executed during the syntax check. If the server configuration is tested with -Dsyntax_check:

panic% httpd -t -Dsyntax_check

you can check in your code whether syntax_check was set with:

Apache->define('syntax_check')

If, for example, you want to prevent the code in startup.pl from being executed, add the following at the top of the code:

return if Apache->define('syntax_check');

Of course, there is nothing magical about using the string 'syntax_check' as a flag; you can use any other string as well.

4.6 The Scope of mod_perl Configuration Directives

Table 4-1 depicts where the various mod_perl configuration directives can be used.

Table 4-1. The Scope of mod_perl configuration directives

Directive          Global   <VirtualHost>   <Directory>
PerlTaintCheck       V
PerlWarn             V
PerlFreshRestart     V
PerlPassEnv          V           V
PerlRequire          V           V               V
PerlModule           V           V               V
PerlAddVar           V           V               V
PerlSetEnv           V           V               V
PerlSetVar           V           V               V
PerlSetupEnv         V           V               V
PerlSendHeader       V           V               V
<Perl> Sections      V           V               V

The Global column marks directives that can appear in the global configuration; that is, outside all sections. Note that PerlTaintCheck, PerlWarn, and PerlFreshRestart can be placed inside <VirtualHost> sections. However, because there's only one Perl interpreter for all virtual hosts and the main server, setting any of these values in one virtual host affects all other servers. Therefore, it's probably a good idea to think of these variables as being allowed only in the global configuration.

The <VirtualHost> column marks directives that can appear inside <VirtualHost> sections.

The <Directory> column marks directives that can appear in the <Directory>, <Location>, and <Files> sections and all their regex variants. These mod_perl directives can also appear in .htaccess files.

For example, PerlWarn cannot be used in <Directory> and <VirtualHost> sections. However, PerlSetEnv can be used anywhere, which allows you to provide different behavior in different sections:

PerlSetEnv ADMIN_EMAIL webmaster@example.com
 <Location /bar/manage/>
     PerlSetEnv ADMIN_EMAIL bar@example.com
 </Location>

In this example, a handler invoked from /bar/manage/ will see the ADMIN_EMAIL environment variable as bar@example.com, while other handlers configured elsewhere will see ADMIN_EMAIL as the default value, webmaster@example.com.
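On the Perl side, the value can be read from %ENV. The sketch below is written as a plain script in the style of those run under Apache::Registry, and assumes the variable is visible in %ENV there. The helper admin_email( ) and its fallback address are illustrative assumptions that let the script also run outside the server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the address configured with PerlSetEnv,
# or a fallback if the variable is not set (e.g., outside the server).
sub admin_email {
    return defined $ENV{ADMIN_EMAIL}
        ? $ENV{ADMIN_EMAIL}
        : 'webmaster@example.com';
}

print "Content-type: text/plain\n\n";
print "Contact: ", admin_email( ), "\n";
```

With the configuration above, the same script would report bar@example.com when it is served from /bar/manage/ and webmaster@example.com elsewhere.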

4.7 Apache Restarts Twice

When the server is started, the configuration and module initialization phases are called twice before the children are forked. The second pass is done to test that all modules can survive a restart (SIGHUP), in order to ensure that future graceful restarts will work correctly. This is very important if you are going to restart a production server.

You can control what Perl code will be executed on the start or restart by checking the values of $Apache::Server::Starting and $Apache::Server::ReStarting. The former variable is true when the server is starting, and the latter is true when it's restarting.

For example, if you want to be notified when the server starts or restarts, you can do:

<Perl>
     email_notify("start")   if $Apache::Server::Starting;
     email_notify("restart") if $Apache::Server::ReStarting;
 </Perl>

where the function email_notify( ) (that you have to write) performs the notification. Since Apache restarts itself on start, you will get both notifications when Apache is started, and only one when it's restarted.

The startup.pl file and similar files loaded via PerlModule or PerlRequire are compiled only once, because once the module is compiled, it enters the special %INC hash. When Apache restarts, Perl checks whether the module or script in question is already registered in %INC and won't try to compile it again.

Thus, the only code that you might need to protect from running on restart is the code in <Perl> sections. But since <Perl> sections are primarily used for creating on-the-fly configurations, it shouldn't be a problem to run the code more than once.
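The %INC behavior described above can be observed in plain Perl, outside the server. In this minimal sketch, the file name startup_demo.pl and the counter $load_count are made up for the demonstration. The script writes a tiny file into a temporary directory and requires it twice.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create a throwaway file that bumps a counter each time it is compiled and run.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/startup_demo.pl";

open my $fh, '>', $file or die "Cannot write $file: $!";
print $fh '$main::load_count++; 1;', "\n";
close $fh or die "Cannot close $file: $!";

our $load_count = 0;

require $file;    # compiles and runs the file, then records it in %INC
require $file;    # finds the entry in %INC and does nothing

print "loaded $load_count time(s)\n";
print "registered in \%INC: ", (exists $INC{$file} ? "yes" : "no"), "\n";
```

The counter is incremented only once, because the second require( ) sees that the file is already registered in %INC.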

4.8 Enabling Remote Server Configuration Reports

The nifty mod_info Apache module displays the complete server configuration in your browser. In order to use it, you have to compile it in or, if the server was compiled with DSO mode enabled, load it as an object. Then just uncomment the already prepared section in the httpd.conf file:

<Location /server-info>
     SetHandler server-info
     Order deny,allow
     Deny from all
     Allow from localhost
 </Location>

Now restart the server and issue the request:

http://localhost/server-info

We won't show a snapshot of the output here, as it's very lengthy. However, you should know that mod_info is unaware of the configuration created or modified by <Perl> sections or equivalent methods discussed earlier in this chapter.

4.9 Tips and Tricks

The following are miscellaneous tips and tricks that might save you lots of time when configuring mod_perl and Apache.

4.9.1 Publishing Port Numbers Other Than 80

If you are using a dual-server setup, with a mod_perl server listening on a high port (e.g., 8080), don't publish the high port number in URLs. Rather, use a proxying rewrite rule in the non-mod_perl server:

RewriteEngine      On
 RewriteLogLevel    0
 RewriteRule       ^/perl/(.*) http://localhost:8080/perl/$1 [P]
 ProxyPassReverse   /          http://localhost/

In the above example, all the URLs starting with /perl are rewritten to the backend server, listening on port 8080. The backend server is not directly accessible; it can be reached only through the frontend server.

One of the problems with publishing high port numbers is that Microsoft Internet Explorer (IE) 4.x has a bug when re-posting data to a URL with a nonstandard port (i.e., anything but 80). It drops the port designator and uses port 80 anyway. Hence, your service will be unusable for IE 4.x users.

Another problem is that firewalls will probably have most of the high ports closed, and users behind them will be unable to reach your service if it is running on a blocked port.

4.9.2 Running the Same Script from Different Virtual Hosts

When running under a virtual host, Apache::Registry and other registry family handlers will compile each script into a separate package. The package name includes the name of the virtual host if the variable $Apache::Registry::NameWithVirtualHost is set to 1. This is the default behavior.

Under this setting, two virtual hosts can have two different scripts accessed via the same URI (e.g., /perl/guestbook.pl) without colliding with each other. Each virtual host will run its own version of the script.

However, if you run a big service and provide a set of identical scripts to many virtual hosts, you will want to have only one copy of each script compiled in memory. By default, each virtual host will create its own copy, so if you have 100 virtual hosts, you may end up with 100 copies of the same script compiled in memory, which is very wasteful. If this is the case, you can override the default behavior by setting the following directive in a startup file or in a <Perl> section:

$Apache::Registry::NameWithVirtualHost = 0;

But be careful: this makes sense only if you are sure that there are no other scripts with identical URIs but different content on different virtual hosts.

Users of mod_perl v1.15 are encouraged to upgrade to the latest stable version if this problem is encountered; it was solved starting with mod_perl v1.16.

4.10 Configuration Security Concerns

Any service open to the Internet at large must take security into account. Large, complex software tends to expose subtle vulnerabilities that attackers can exploit to gain unauthorized access to the server host. Third-party modules or libraries can also contain similarly exploitable bugs. Perl scripts aren't immune either: incorrect untainting and sanitizing of user input can lead to disaster when this input is fed to the open( ) or system( ) functions.

Also, if the same mod_perl server is shared by more than one user, you may need to protect users of the server from each other (see Appendix C).

4.10.1 Using Only Absolutely Necessary Components

The more modules you have enabled in your web server, the more complex the code and interaction between these modules will be. The more complex the code in your web server, the more chances for bugs there are. The more chances for bugs, the more chance there is that some of those bugs may involve security holes.

Before you put the server into production, review the server setup and disable any unused modules. As time goes by, the server environment may change and some modules may no longer be used. Review your setup periodically and disable modules that aren't in use.

4.10.2 Taint Checking

Make sure to run the server with the following setting in the httpd.conf file:

PerlTaintCheck On

As discussed in Chapter 6, taint checking doesn't ensure that your code is completely safe from external hacks, but it does force you to improve your code to prevent many potential security problems.
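The kind of change that taint checking pushes you toward is validating external data against a strict pattern and using only the captured match. The following is a minimal sketch, with a hypothetical helper untaint_filename( ) and a deliberately narrow pattern chosen for illustration. What counts as an acceptable value depends on your application.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: accept only simple filenames (letters, digits,
# underscore, dot, hyphen; no slashes, no leading dot or hyphen) and
# return the captured copy, which Perl treats as untainted.
# Returns undef for anything else.
sub untaint_filename {
    my ($input) = @_;
    return undef unless defined $input;
    if ($input =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/) {
        return $1;
    }
    return undef;
}

for my $candidate ('report.txt', '../etc/passwd', 'rm -rf /|') {
    my $safe = untaint_filename($candidate);
    printf "%-15s => %s\n", $candidate,
        defined $safe ? "accepted as $safe" : 'rejected';
}
```

Only the first candidate is accepted. The other two contain characters outside the allowed set, so they never reach open( ) or system( ).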

4.10.3 Hiding Server Information

We aren't completely sure why the default value of the ServerTokens directive in Apache is Full rather than Minimal. It seems like Full is really useful only for debugging purposes. A probable reason for using ServerTokens Full is publicity: it means that Netcraft (http://netcraft.com/) and other similar survey services will count more Apache servers, which is good for all of us. In general, though, you really want to reveal as little information as possible to potential crackers.

Another approach is to modify the httpd sources to not reveal any unwanted information, so that all responses return an empty or phony Server: field.

Be aware, however, that there's no security by obscurity (as the old saying goes). Any determined cracker will eventually figure out what version of Apache is running and what third-party modules are built in.

You can see what information is revealed by your server by telnetting to it and issuing a request. For example:

panic% telnet localhost 8080
 Trying 127.0.0.1
 Connected to localhost
 Escape character is '^]'.
 HEAD / HTTP/1.0
 
 HTTP/1.1 200 OK
 Date: Sun, 16 Apr 2000 11:06:25 GMT
 Server: Apache/1.3.24 (Unix) mod_perl/1.26 mod_ssl/2.8.8 OpenSSL/0.9.6
 [more lines snipped]

As you can see, a lot of information is revealed when ServerTokens Full has been specified.

4.10.4 Making the mod_perl Server Inaccessible from the Outside

It is best not to expose mod_perl to the outside world, as it creates a potential security risk by revealing which modules you use and which operating system you are running your web server on. In Chapter 12, we show how to make mod_perl inaccessible directly from the outside by listening only to requests coming from mod_proxy at the local host (127.0.0.1).

4.10.5 Protecting Private Status Locations

It's a good idea to protect your various monitors, such as /perl-status, with a password. The less information you provide to intruders, the harder it will be for them to break in. (One of the biggest favors you can do for these bad guys is to show them all the scripts you use. If any of these are in the public domain, they can grab the source of the script from the Web, study it, and probably find a few or even many security holes in it.)

Security by obscurity may help to wave away some of the less-determined malicious fellas, but it doesn't really work against a determined intruder. For example, consider the old <Limit> container:

<Location /sys-monitor>
     SetHandler perl-script
     PerlHandler Apache::VMonitor
     AuthUserFile /home/httpd/perl/.htpasswd
     AuthGroupFile /dev/null
     AuthName "Server Admin"
     AuthType Basic
     <Limit GET POST>
         require user foo bar
     </Limit>
 </Location>

Use of the <Limit> container is a leftover from NCSA server days that is still visible in many configuration examples today. In Apache, it will limit the scope of the require directive to the GET and POST request methods. Use of another method will bypass authentication. Since most scripts don't bother checking the request method, content will be served to the unauthenticated users.

For this reason, the Limit directive generally should not be used. Instead, use this secure configuration:

<Location /sys-monitor>
     SetHandler perl-script
     PerlHandler Apache::VMonitor
     AuthUserFile /home/httpd/perl/.htpasswd
     AuthGroupFile /dev/null
     AuthName "Server Admin"
     AuthType Basic
     require user foo bar
 </Location>

The contents of the password file (/home/httpd/perl/.htpasswd) are populated by the htpasswd utility, which comes bundled with Apache:

foo:1SA3h/d27mCp
 bar:WbWQhZM3m4kl
