2.1 Servlet And Container

What if you had Java, but no servlets or containers? You could still handle HTTP requests with plain Java SE, but it would take an overwhelming amount of effort. If no container existed, you would have to implement at least the following key functions yourself in plain old Java:

  • Create a socket connection with the server, and create a listener for the socket.
  • Create a thread manager.
  • Implement security.
  • Convert a JSP to a servlet.
  • ...
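
To get a feel for the effort involved, here is a minimal sketch of the first two items (a socket listener and a thread manager) written with Java SE only. The class name TinyHttpServer and its methods are hypothetical and exist only for this illustration; security, JSP translation, request parsing beyond the first line, and error handling are all left out, and the demo serves a single request so that it terminates.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical class for illustration only: a bare-bones HTTP responder in plain Java SE.
class TinyHttpServer {

    // Turns a request line such as "GET /hello HTTP/1.1" into a full HTTP response.
    static String buildResponse(String requestLine) {
        boolean isGet = requestLine != null && requestLine.startsWith("GET ");
        String status = isGet ? "200 OK" : "405 Method Not Allowed";
        String body = isGet
                ? "<html><body><h1>Hello World!</h1></body></html>"
                : "<html><body><h1>Only GET is supported</h1></body></html>";
        int length = body.getBytes(StandardCharsets.UTF_8).length;
        return "HTTP/1.1 " + status + "\r\n"
                + "Content-Type: text/html; charset=utf-8\r\n"
                + "Content-Length: " + length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n"
                + body;
    }

    // Serves one connection: read the request, write the response, close the socket.
    static void handle(Socket socket) {
        try (Socket s = socket) {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
            String requestLine = in.readLine();
            // Skip the request headers, which end at the first empty line.
            String header;
            while ((header = in.readLine()) != null && !header.isEmpty()) { }
            OutputStream out = s.getOutputStream();
            out.write(buildResponse(requestLine).getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4); // the "thread manager"
        try (ServerSocket server = new ServerSocket(0)) {       // port 0 = any free port
            // A real server would loop on accept() forever; this demo serves one request.
            pool.submit(() -> { handle(server.accept()); return null; });
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort())) {
                OutputStream out = client.getOutputStream();
                out.write("GET /hello HTTP/1.1\r\nHost: localhost\r\n\r\n"
                        .getBytes(StandardCharsets.UTF_8));
                out.flush();
                System.out.println(new String(
                        client.getInputStream().readAllBytes(), StandardCharsets.UTF_8));
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

Even this toy version needs dozens of lines before any business logic appears, and it still ignores most of the HTTP protocol.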

Thanks to the container, you get to concentrate on your own business logic instead of worrying about writing code for threading, security, and networking.

[!TIP] Tomcat is the container we are using in the textbook.

How the container handles a request

As you already know, String is a Java class used to manage a sequence of characters; Date is a Java class used to manipulate time; Math is a Java class that provides mathematical operations...

What is a servlet, essentially? From the point of view of code, a servlet is no exception: it is a Java class used to extend the capabilities of servers that host applications accessed by means of a request-response programming model.

[!TIP] A servlet is a small Java program that runs within a Web server[1].

public class HelloServlet extends HttpServlet {
    ...
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        ...
    }
}

Let's take a closer look at HelloServlet.java. As the name implies, the doGet() method handles an HTTP GET request and then produces a response. In plain old Java code, there must be another Java class that creates the instances of HelloServlet, HttpServletRequest and HttpServletResponse. Without a doubt, that is the container's responsibility. The pseudo code[2] can be described as follows:

// what happens in the container
HelloServlet servlet = new HelloServlet();
HttpServletRequest request = new HttpServletRequest();
HttpServletResponse response = new HttpServletResponse();
servlet.doGet(request, response);

Here's a quick overview of how the container handles a request; it illustrates what happens behind the scenes of Fig.1.12. We assume that the request is an HTTP GET.

  • Step 1: User clicks a link that has a URL to a servlet instead of a static page. The click generates an HTTP request that is sent to the container.

Figure 2.1 Step 1: User sends a request.

  • Step 2: The container "sees" that the request is for a servlet, so the container creates two objects: 1) HttpServletResponse and 2) HttpServletRequest.

Figure 2.2 Step 2: Container creates request and response.

  • Step 3: The container finds the correct servlet based on the URL in the request, creates or allocates a thread for that request, and passes the request and response objects to the servlet thread[3]. Note that different requests to the same servlet are handled in separate threads.

[!TIP] We will revisit the thread related issues in Section 3.1.

Figure 2.3 Step 3: Container passes request and response to servlet thread.

  • Step 4: The container calls the servlet's doGet(), which generates a dynamic page and writes the page into the response object. Remember, the container still has a reference to the response object[4].

Figure 2.3 Step 4: Container calls doGet().

  • Step 5: The thread completes, and the container converts the response into an HTTP response, sends it back to the client, then deletes the request and response objects[5].

Figure 2.3 Step 5: Container sends back HTTP response.
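
The five steps above can be mimicked in a few dozen lines of plain Java. The following is only a sketch: every Mini* type is a hypothetical stand-in invented for this illustration and is not part of the Jakarta EE API, requests are reduced to a single request line, and only GET is supported.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// All Mini* types are hypothetical stand-ins, NOT the real Jakarta EE classes.
class MiniContainer {
    private final Map<String, MiniServlet> servlets; // URL -> servlet instance
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    MiniContainer(Map<String, MiniServlet> servlets) {
        this.servlets = servlets;
    }

    // Step 1 delivers a request line such as "GET /hello HTTP/1.1".
    String handle(String requestLine) {
        String[] parts = requestLine.split(" ");
        if (parts.length < 2) return "HTTP/1.1 400 Bad Request\r\n\r\n";
        if (!parts[0].equals("GET")) return "HTTP/1.1 405 Method Not Allowed\r\n\r\n";

        // Step 2: create the request and response objects.
        MiniRequest request = new MiniRequest(parts[0], parts[1]);
        MiniResponse response = new MiniResponse();

        // Step 3: find the servlet by URL and hand the work to a pooled thread.
        MiniServlet servlet = servlets.get(request.path);
        if (servlet == null) return "HTTP/1.1 404 Not Found\r\n\r\n";
        try {
            // Step 4: the thread calls doGet(), which fills the response object.
            pool.submit(() -> servlet.doGet(request, response)).get();
        } catch (Exception e) {
            return "HTTP/1.1 500 Internal Server Error\r\n\r\n";
        }

        // Step 5: convert the response object into HTTP text; request and response
        // become unreachable afterwards and are left to the garbage collector.
        return "HTTP/1.1 200 OK\r\n"
                + "Content-Type: " + response.contentType + "\r\n\r\n"
                + response.body();
    }

    void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        MiniContainer container = new MiniContainer(Map.of("/hello", new HelloMiniServlet()));
        System.out.println(container.handle("GET /hello HTTP/1.1"));
        container.shutdown();
    }
}

class MiniRequest {
    final String method;
    final String path;

    MiniRequest(String method, String path) {
        this.method = method;
        this.path = path;
    }
}

class MiniResponse {
    String contentType = "text/plain";
    private final StringWriter buffer = new StringWriter();
    private final PrintWriter writer = new PrintWriter(buffer);

    void setContentType(String type) { contentType = type; }
    PrintWriter getWriter() { return writer; }
    String body() { writer.flush(); return buffer.toString(); }
}

interface MiniServlet {
    void doGet(MiniRequest request, MiniResponse response);
}

class HelloMiniServlet implements MiniServlet {
    @Override
    public void doGet(MiniRequest request, MiniResponse response) {
        response.setContentType("text/html");
        response.getWriter().println("<h1>Hello World!</h1>");
    }
}
```

Note how the container keeps its own reference to the response object while the servlet writes into it; that shared reference is what lets Step 5 read back what Step 4 produced.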

What makes servlet a servlet

Back to the servlet itself, let's revisit how it looks in code line by line (ch1/HelloServlet.java).

// lines 1-6
package com.swufe.javaee.first_java_ee;

import java.io.*;

import jakarta.servlet.http.*;
import jakarta.servlet.annotation.*;

Every Java programmer should know what package and import are: a package organizes Java files into different modules or folders in your file system, and importing a package allows you to access the classes in that package from the current Java file. Here java.io.* is a package of Java SE (under the java namespace), while the remaining two belong to Jakarta EE (under the jakarta namespace).

[!NOTE] In Java EE 8 and earlier (as well as Jakarta EE 8), the namespace is javax; it has been changed to jakarta since Jakarta EE 9.

We leave the explanation of line 8 for the next subsection How the container finds the correct servlet.

// line 9
public class HelloServlet extends HttpServlet {

Here we have a Java class named HelloServlet extending jakarta.servlet.http.HttpServlet, so this self-defined class can reuse many methods as well as fields from HttpServlet. As we mentioned above, the main purpose of a servlet is to handle requests and then make responses. Although there are other communication protocols (e.g., FTP, SMTP), Jakarta EE only provides support for HTTP(S), the most widely used protocol in the web era. As its name implies, HttpServlet is used to create an HTTP servlet suitable for a Web site.

[!NOTE] 99.999% of all servlets are HttpServlets.

What about the remaining 0.001%? Well, in rare cases, you can even implement your own servlet to handle other network protocols in addition to HTTP(S). The following is a simple UML class diagram[6] to illustrate the servlet family.

Figure 2.4 Servlet class diagram.

As we can see, Servlet is an interface, and GenericServlet, an abstract class, is a protocol-independent servlet that implements it. In UML, implementation is drawn as a dashed line with a hollow triangle at the interface end (----▻). HttpServlet is also an abstract class, and it extends GenericServlet. In UML, inheritance is drawn as a line with a hollow triangle at the superclass end. Understanding a system's class diagram is important if you want a quick, high-level overview.
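
The same family can be written down as a skeleton in code. This is a heavily simplified sketch with hypothetical Simple* names, not the real Jakarta EE types: requests are reduced to a method string and responses to a status string. The dispatch inside service() is loosely modelled on how an HTTP servlet routes each HTTP method to a doXxx() method.

```java
// Hypothetical demo class; all Simple* types below are simplified stand-ins.
class ServletFamilyDemo {
    public static void main(String[] args) {
        SimpleServlet servlet = new HelloSimpleServlet();
        System.out.println(servlet.service("GET"));
        System.out.println(servlet.service("POST"));
    }
}

// The interface: what every servlet must be able to do, whatever the protocol.
interface SimpleServlet {
    String service(String method);
}

// Protocol-independent abstract class: shared housekeeping, no protocol logic.
abstract class SimpleGenericServlet implements SimpleServlet {
    String getServletName() {
        return getClass().getSimpleName();
    }
}

// HTTP-specific abstract class: dispatches each HTTP method to a doXxx() method.
abstract class SimpleHttpServlet extends SimpleGenericServlet {
    @Override
    public String service(String method) {
        switch (method) {
            case "GET":
                return doGet();
            case "POST":
                return doPost();
            default:
                return "501 Not Implemented";
        }
    }

    // Subclasses override only the methods they want to support.
    protected String doGet() {
        return "405 Method Not Allowed";
    }

    protected String doPost() {
        return "405 Method Not Allowed";
    }
}

// A concrete servlet only fills in the business logic for GET.
class HelloSimpleServlet extends SimpleHttpServlet {
    @Override
    protected String doGet() {
        return "200 OK: Hello World!";
    }
}
```

Because HelloSimpleServlet overrides only doGet(), a POST falls through to the default in the abstract class, which is why a concrete servlet can stay so short.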

// lines 16-24
public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException { // 16
    response.setContentType("text/html"); // 17
    String message = "Hello World!"; // 18
    // Hello
    PrintWriter out = response.getWriter(); // 20
    out.println("<html><body>"); // 21
    out.println("<h1>" + message + "</h1>"); // 22
    out.println("</body></html>"); // 23
}

Slightly different from HelloServlet.java, we move the message variable into doGet() for ease of explanation. First of all, doGet() is inherited from the HttpServlet class, and it handles the HTTP GET method. In other words, HelloServlet overrides this method, and to make that explicit it is recommended to add the @Override annotation before the method signature. As we discussed before, both request and response are created by the container; HttpServletRequest and HttpServletResponse are interfaces that provide request and response information for HTTP servlets, respectively.

  • Line 17: setContentType() is a method of HttpServletResponse that specifies the content type, a.k.a. the MIME type, of a response. You may refer to Section 1.4 if you are lost.
  • Line 20: Generally speaking, there are two kinds of data: character text (e.g., .html, .txt) and non-text binary data (e.g., .jar, .png). The getWriter() method of a response returns a PrintWriter object that can send character text to the client.
  • Lines 21-23: Write some HTML elements into the PrintWriter. Unlike System.out.println(), which displays text on the standard output[7], println() of java.io.PrintWriter writes text into a text-output stream[8], and you can simply imagine that a response object wraps a text stream.

Figure 2.5 Response and PrintWriter.

After doGet() is called, the container will convert this response object wrapped with contents into a real HTTP response and send it back to the client (i.e., a web browser in our case).
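
To see that a PrintWriter writes into whatever stream it wraps rather than onto the screen, the sketch below points one at an in-memory java.io.StringWriter. The StringWriter here merely stands in for the text stream inside a response object; the class name PrintWriterDemo and the method renderPage() are hypothetical.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical demo class: PrintWriter output goes to the wrapped stream, not to stdout.
class PrintWriterDemo {

    // Writes the same HTML as doGet() into an in-memory text stream and returns it.
    static String renderPage(String message) {
        StringWriter buffer = new StringWriter(); // stands in for the response's stream
        try (PrintWriter out = new PrintWriter(buffer)) {
            out.println("<html><body>");
            out.println("<h1>" + message + "</h1>");
            out.println("</body></html>");
        } // closing the writer flushes everything into the buffer
        return buffer.toString();
    }

    public static void main(String[] args) {
        String page = renderPage("Hello World!"); // nothing appears on the screen here
        System.out.print(page);                   // only this line prints to stdout
    }
}
```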

How the container finds the correct servlet

In a real system, there is a large number of servlets. So, a natural question is: how does the container find the correct servlet?

Suppose you are writing a letter to your friend Bob. What information helps the postman deliver your letter to Bob? Well, you may be asked to write down Bob's address and name on the envelope. Similarly, the container also needs each servlet's "address" and "name". Recall Step 3 of How the container handles a request:

The container finds the correct servlet based on the URL in the request.

The "address" (or "name") of a servlet corresponds to the resource name in a URL that people input in the address bar of a web browser.

Figure 2.6 URL anatomy.

In the last subsection, we omitted line 8 of HelloServlet.java on purpose.

// line 8
@WebServlet(name = "helloServlet", value = "/hello-servlet")

It is a Java annotation provided by Jakarta EE that makes it possible to map URLs to servlets[9], and it must be placed before the class declaration. The key component of this annotation is value = "/hello-servlet", which means that requests for the resource named hello-servlet are routed to this servlet. In contrast, name = "helloServlet" is optional and serves no real purpose in this example. You can also use urlPatterns to specify such a mapping:

@WebServlet(urlPatterns = "/hello-servlet")

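[!TIP] Both value and urlPatterns in @WebServlet map URLs and are interchangeable, but they must not be used together in the same annotation; value is the convenient form when the URL pattern is the only attribute you set. More rules and usages will be covered later in this book.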

It is also fine to map several URLs to this servlet by specifying a list of values in value or urlPatterns.

@WebServlet(urlPatterns = {"/a", "/b"})

Then you can access this servlet via either /a or /b. By the way, as with variable naming, you should choose meaningful URL patterns for a servlet. Since the URL mapping is the most important piece of information about a servlet, there is a shorthand:

@WebServlet("/c")
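
How might a container turn such annotations into a routing table? The sketch below shows the general idea with Java reflection. It is an assumption-laden illustration, not how any particular container is implemented: @Route is a hypothetical annotation standing in for @WebServlet, the candidate classes are passed in explicitly instead of being discovered by scanning the application, and only exact URL matches are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo: builds a URL -> class routing table from annotations.
class RoutingDemo {

    static Map<String, Class<?>> buildRoutes(List<Class<?>> candidates) {
        Map<String, Class<?>> routes = new HashMap<>();
        for (Class<?> type : candidates) {
            Route route = type.getAnnotation(Route.class); // null if not annotated
            if (route == null) {
                continue;
            }
            for (String url : route.value()) {
                // This sketch rejects two classes that claim the same URL.
                if (routes.putIfAbsent(url, type) != null) {
                    throw new IllegalStateException("Duplicate mapping for " + url);
                }
            }
        }
        return routes;
    }

    public static void main(String[] args) {
        Map<String, Class<?>> routes =
                buildRoutes(List.of(ServletAB.class, ServletC.class, PlainClass.class));
        System.out.println(routes.get("/a").getSimpleName());
        System.out.println(routes.get("/c").getSimpleName());
    }
}

// Hypothetical stand-in for @WebServlet; RUNTIME retention makes it visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Route {
    String[] value();
}

@Route({"/a", "/b"})
class ServletAB { }

@Route("/c") // shorthand: a single value needs no braces and no attribute name
class ServletC { }

class PlainClass { } // not annotated, so it never appears in the routing table
```

The annotation itself does nothing at runtime; it is only metadata, and the mapping takes effect because some other code (here buildRoutes(), in reality the container) reads it.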

Once upon a time: A servlet's name

Before Servlet API 3.0, setting up URL mappings was rather cumbersome: you had to use the deployment descriptor (DD) to tell the container how to run your servlets and JSPs. The DD is a fairly simple XML document (src | main | webapp | WEB-INF | web.xml)[10]. Since Servlet API 3.0, annotations can replace the equivalent XML configuration in the DD, such as servlet declarations and servlet mappings.

First of all, let's have a glance at web.xml. Basically, its syntax is similar to that of HTML, which we have studied previously. The default DD is nearly empty, except for a root element <web-app>. The DD provides a "declarative" mechanism for customizing your web applications without touching source code. This flexibility is sometimes preferred because any change to the source code requires re-compiling (i.e., converting .java to .class) and re-packaging.

There is no essential difference between annotations and the DD in terms of servlet mapping. The main idea is the same: create a connection between a servlet and a URL. A servlet is a Java class with a fully-qualified name (i.e., package name + class name). For example, HelloServlet's fully-qualified class name in ch2 is com.swufe.javaee.ch2.HelloServlet. First, you have to map an internal name, which is optional in annotations but required in the DD, to the fully-qualified class name using the <servlet> element. Note that the internal name is visible only to developers, not to end users.

<servlet>
    <servlet-name>Some Name</servlet-name>
    <servlet-class>com.swufe.javaee.ch2.HelloServlet</servlet-class>
</servlet>

As we can see, <servlet-name> and <servlet-class>, nested inside <servlet>, are used to specify the internal name and fully-qualified name, respectively.

Next, you have to map the internal name to URLs using the <servlet-mapping> element. The value inside <servlet-name> is the one we defined in <servlet>, and <url-pattern> serves the same purpose as value or urlPatterns in annotations.

<servlet-mapping>
    <servlet-name>Some Name</servlet-name>
    <url-pattern>/hello-servlet</url-pattern>
</servlet-mapping>

To sum up, there are THREE names of a servlet, and servlet-name is the bridge mapping the servlet to URLs:

  • servlet-name: the internal name, which is logical.
  • servlet-class: the fully-qualified class name, which is physical.
  • url-pattern: the URL name, which is visible to end users.

Figure 2.7 Three names of a servlet.
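
The two-step lookup (url-pattern to servlet-name, then servlet-name to servlet-class) can be sketched with the XML parser that ships with Java SE. This is only an illustration under simplifying assumptions: the class DescriptorDemo and its resolve() method are hypothetical, the descriptor is an inline string containing just the two elements shown above, and only exact URL matches are considered.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo: resolves a URL to a servlet class via the internal name.
class DescriptorDemo {

    static final String WEB_XML =
            "<web-app>"
          + "<servlet>"
          + "<servlet-name>Some Name</servlet-name>"
          + "<servlet-class>com.swufe.javaee.ch2.HelloServlet</servlet-class>"
          + "</servlet>"
          + "<servlet-mapping>"
          + "<servlet-name>Some Name</servlet-name>"
          + "<url-pattern>/hello-servlet</url-pattern>"
          + "</servlet-mapping>"
          + "</web-app>";

    // Returns the fully-qualified class name mapped to the URL, or null if none.
    static String resolve(String xml, String url) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // <servlet>: internal name -> fully-qualified class name
            Map<String, String> nameToClass = new HashMap<>();
            NodeList servlets = doc.getElementsByTagName("servlet");
            for (int i = 0; i < servlets.getLength(); i++) {
                Element e = (Element) servlets.item(i);
                nameToClass.put(text(e, "servlet-name"), text(e, "servlet-class"));
            }

            // <servlet-mapping>: URL -> internal name, then follow the bridge
            NodeList mappings = doc.getElementsByTagName("servlet-mapping");
            for (int i = 0; i < mappings.getLength(); i++) {
                Element e = (Element) mappings.item(i);
                if (url.equals(text(e, "url-pattern"))) {
                    return nameToClass.get(text(e, "servlet-name"));
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid descriptor", e);
        }
    }

    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        System.out.println(resolve(WEB_XML, "/hello-servlet"));
    }
}
```

The end user only ever supplies the URL; the internal name is the key that joins the two XML elements, and the class name stays hidden on the server side.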

Clearly, annotations make our lives easier, and they are also widely used in many popular frameworks, including Java's Spring and Python's Flask, so it is recommended to use annotations if possible.


[1] In linguistics, the suffix -let often means small. For example, a booklet is a small book or group of pages; a tablet is a small, solid piece of medicine. So servlet, literally, means a small program running in the server.

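[2] This pseudo code doesn't really exist in Jakarta EE; it serves only for illustration here. In particular, HttpServletRequest and HttpServletResponse are interfaces, so they cannot be instantiated with new directly; the container creates instances of its own implementation classes.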

[3] A process is a program in execution, and a thread is a unit of execution within a process. Since a servlet may be accessed by thousands of people simultaneously, many threads are needed to serve the requests. Creating threads can be a time-consuming task, so many systems maintain a thread pool; when a new request comes in, an existing thread and servlet instance are allocated and reused for better performance.

[4] Object variables in Java hold references, so when an object is passed to a method, any change to the object can be observed by anyone who holds a reference to that object.

[5] Deletion happens implicitly, under the control of Java's garbage collector.

[6] UML, short for Unified Modeling Language, is a general-purpose, developmental modeling language in the field of software engineering that is intended to provide a standard way to visualize the design of a system. A class diagram in UML is a type of static structure diagram that describes the structure of a system by showing the system's classes, their attributes, operations (or methods), and the relationships among objects. However, the class diagram used in this textbook is a simplified version, not the standard one.

[7] The default destination of the standard output is the display screen of the computer that started the program.

[8] I/O and networking operations are often abstracted as streams. Like water flowing in a stream, data in a program flows toward a destination.

[9] Such a mapping from URLs to code (i.e., methods, classes) is often called routing.

[10] XML, short for Extensible Markup Language, is similar to HTML, but without predefined tags to use. Instead, you define your own tags designed specifically for your needs.