A look at Apache modules

Now for some real Apache code


Column As I write this, I'm on the Eurostar train, just returning from O'Reilly's OSCon (Open Source Conference) in Brussels. Some fascinating insights there; and even my own talk generated some interesting discussion. Some of the delegates, including O'Reilly himself, are promoting opensource ideas going beyond software and into society more generally. I've touched on that in this very column before now, but a related argument that's new to me is that the open/closed debate in software could become largely sidelined, as the industry focuses on software as a service (such as Google's offerings) more than as a product.

Anyway, I've been meaning for some time to give you an article on writing Apache modules. But first things first. If you're going to write modules, you'll need a proper Apache installation. A safe option is to download the latest release version (currently 2.2.3) from httpd.apache.org, and install it from that. For the adventurous, you could install the current development version from svn.apache.org. If you use a different package such as an rpm or deb, you'll probably need an "apache-dev" package as well as Apache itself.

What is a module?

As I'm sure I've said, Apache has a diverse developer and user profile, and we all have very different uses for it. This approach to serving a wide range of needs is based on a small core, together with a large number of modules. Most of what you get when you install Apache as a package, from apache.org or elsewhere, comprises the modules that perform Apache's standard functions. Even such a simple task as serving an "index.html" file involves various modules: for example, mod_dir to resolve the URL http://www.example.com/ to the file index.html, and mod_mime to determine its MIME type as "text/html" so the browser knows how to render it.

Just as Apache's standard functions are driven by modules, so we can write new modules to change its behaviour, or introduce entirely new capabilities. Some examples of the kind of things we can do with modules include:

  • A content generator module takes an HTTP request and generates a response, in the manner of, for example, a CGI or PHP script. The default handler simply serves up a static file from local disc, while others may implement a service such as XML-RPC, do custom processing, or (like mod_cgi) delegate the work to an external script.
  • A mapper module runs before content generation, and determines how a request will be processed. For example, mod_negotiation selects amongst different versions of a document (e.g. different languages) according to browser preferences, while mod_alias and mod_rewrite perform rule-based URL manipulation.
  • An authentication module ascertains the identity of a user. When used, it is usually accompanied by an authorization module, which determines whether the user is permitted the attempted operation.
  • A filter module transforms incoming and/or outgoing data. Filters may be chained arbitrarily, and are the building blocks for sophisticated processing and aggregation applications. They range from simple content manipulation such as server side includes, through compression, to SSL encryption, and include many of the most exciting third-party applications.
  • A service module may export an entirely new API and/or service for other modules. For example, mod_dbd manages SQL database connections, and mod_xmlns exports an API for namespace-based processing of XML.

A HelloWorld Module

Conceptually, the simplest type of module is the content generator or handler, whose role in Apache is directly equivalent to a CGI or PHP script. That is to say, it processes a request in whatever manner is required, and generates a response to return to the Client. It is not required to deal with the details of the HTTP protocol, though (as with a script) it may do that, or any number of other things. Usually it's good to keep the content generator simple, and use other types of module for different tasks.

So in the spirit of simplicity, let's take a look at a minimal HelloWorld module. Note that, unlike a script, this doesn't live amongst our web documents, so we can't run it straight from the filesystem. We'll need to configure it using a directive such as SetHandler instead:

LoadModule helloworld_module modules/mod_helloworld.so
<Location /helloworld>
        SetHandler helloworld
</Location>

Here's a function to return a HelloWorld page to the client. The prototype is typical: it takes the request_rec (HTTP Request) object as a single argument, and returns an integer status code. The request_rec provides access to everything a handler might need (such as the variables available to a script) and also serves as an I/O descriptor, among other things:

static int helloworld_handler(request_rec *r) {
  /* First, some housekeeping. */
  if (!r->handler || strcasecmp(r->handler, "helloworld") != 0) {
    /* r->handler wasn't "helloworld", so it's none of our business */
    return DECLINED;
  }

  if (r->method_number != M_GET) {
    /* We only accept GET and HEAD requests.
     * They are identical for the purposes of a content generator
     * Returning an HTTP error code causes Apache to return an
     * error page (ErrorDocument) to the client.
     */
    return HTTP_METHOD_NOT_ALLOWED;
  }

  /* OK, we're happy with this request, so we'll return the response. */

  ap_set_content_type(r, "text/html");
  ap_rputs("<title>Hello World!</title> .... etc", r);

  /* we return OK to indicate that we have successfully processed
   * the request.  No further processing is required.
   */
  return OK;
}

So, that's our handler function. Now we need to hook it in to Apache's processing, so it will be run when we get a request for /helloworld. We use a special function that runs at server startup to register our handler with Apache:

static void helloworld_hooks(apr_pool_t *pool) {
  /* hook helloworld_handler in to Apache */
  ap_hook_handler(helloworld_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

This hooks function is itself part of the module object. For most modules, this is the only symbol exported and visible to other modules or the core:

module AP_MODULE_DECLARE_DATA helloworld_module = {
  STANDARD20_MODULE_STUFF,
  NULL,
  NULL,
  NULL,
  NULL,
  NULL,
  helloworld_hooks
};

And that's all! We have a complete HelloWorld module. We can use apxs, a compiler-wrapper that is part of the Apache installation, to compile and (as root) install it:

$ apxs -c mod_helloworld.c
# apxs -ie mod_helloworld.la

Well, as usual I'm over the 1000 words, so I'll bring this first look at modules to a close. For further information, stay tuned. If you're seriously interested, my book is now in production with the publisher, so for the first time you have a more than just the source code and a handful of ad-hoc materials to help upgrade your LAMP and application server skills!


Other stories you might like

  • James Webb Space Telescope has arrived at its new home – an orbit almost a million miles from Earth

    Funnily enough, that's where we want to be right now, too

    The James Webb Space Telescope, the largest and most complex space observatory built by NASA, has reached its final destination: L2, the second Sun-Earth Lagrange point, an orbit located about a million miles away.

    Mission control sent instructions to fire the telescope's thrusters at 1400 EST (1900 UTC) on Monday. The small boost increased its speed by about 3.6 miles per hour to send it to L2, where it will orbit the Sun in line with Earth for the foreseeable future. It takes about 180 days to complete an L2 orbit, Amber Straughn, deputy project scientist for Webb Science Communications at NASA's Goddard Space Flight Center, said during a live briefing.

    "Webb, welcome home!" blurted NASA's Administrator Bill Nelson. "Congratulations to the team for all of their hard work ensuring Webb's safe arrival at L2 today. We're one step closer to uncovering the mysteries of the universe. And I can't wait to see Webb's first new views of the universe this summer."

    Continue reading
  • LG promises to make home appliance software upgradeable to take on new tasks

    Kids: empty the dishwasher! We can’t, Dad, it’s updating its OS to handle baked on grime from winter curries

    As the right to repair movement gathers pace, Korea’s LG has decided to make sure that its whitegoods can be upgraded.

    The company today announced a scheme called “Evolving Appliances For You.”

    The plan is sketchy: LG has outlined a scenario in which a customer who moves to a locale with climate markedly different to their previous home could use LG’s ThingQ app to upgrade their clothes dryer with new software that makes the appliance better suited to prevailing conditions and to the kind of fabrics you’d wear in a hotter or colder climes. The drier could also get new hardware to handle its new location. An image distributed by LG shows off the ability to change the tune a dryer plays after it finishes a load.

    Continue reading
  • IBM confirms new mainframe to arrive ‘late’ in first half of 2022

    Hybrid cloud is Big Blue's big bet, but big iron is predicted to bring a welcome revenue boost

    IBM has confirmed that a new model of its Z Series mainframes will arrive “late in the first half” of 2022 and emphasised the new device’s debut as a source of improved revenue for the company’s infrastructure business.

    CFO James Kavanaugh put the release on the roadmap during Big Blue’s Q4 2021 earnings call on Monday. The CFO suggested the new release will make a positive impact on IBM’s revenue, which came in at $16.7 billion for the quarter and $57.35bn for the year. The Q4 number was up 6.5 per cent year on year, the annual number was a $2.2bn jump.

    Kavanaugh mentioned the mainframe because revenue from the big iron was down four points in the quarter, a dip that Big Blue attributed to the fact that its last mainframe – the Z15 – emerged in 2019 and the sales cycle has naturally ebbed after eleven quarters of sales. But what a sales cycle it was: IBM says the Z15 has done better than its predecessor and seen shipments that can power more MIPS (Millions of Instructions Per Second) than in any previous program in the company’s history*.

    Continue reading

Biting the hand that feeds IT © 1998–2022