Tips for Improving Web Server Performance (Part I)

We hear a lot about cloud infrastructure and scaling out, but the truth is that no matter how much hardware and infrastructure (be it physical or virtual) you have, the bottleneck usually lies in application or database setup and configuration. In this article, I’m going to outline the general directions for optimizing web server performance. In the following articles I will follow up with more detailed suggestions. I will assume you’re using Apache HTTPD as your web server. Most of the optimization paths outlined here are also valid for other web servers (nginx, lighttpd), although they might come under different names and flavors.

Here’s the stuff you should check:

1. Make sure you have KeepAlive enabled.

This is done by setting

KeepAlive On

This setting allows you to avoid the network overhead of establishing a new connection for every single HTTP request. With KeepAlive set to On, the browser establishes just one connection with the web server and then transfers all the request/response pairs over that connection. If KeepAlive were Off, the browser would have to reestablish the connection for every single request. As a ballpark estimate, the time it takes to set up a connection is equivalent to the ping time between the client machine and the server machine (5-50ms). Furthermore, you can estimate that setting KeepAlive On will shave off approximately

Time Saving = (Number of Request per Page) * (Ping Time) / (Parallel Connections from Browser)

Most browsers set up 6 parallel connections. Considering a ping time of around 20ms for a page with 50 requests, that adds up to 50*20/6 ≈ 166 ms shaved off every page by changing just one line of configuration.

Also, be careful with KeepAliveTimeout. This defines how long an idle TCP/IP connection is kept open while waiting for the next request. The downside of setting this parameter too high is that slow clients will unnecessarily keep resources blocked on your server (memory, file descriptors). The default value for this is 15 seconds. As a rule of thumb, don’t go below 5 seconds or above 20 seconds.
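Putting the above together, a minimal sketch of the relevant directives looks like this (the values are illustrative starting points, not a universal recommendation):

```apache
# Reuse one TCP connection for multiple request/response pairs
KeepAlive On

# Close an idle connection after 5 seconds so slow clients
# don't hold server resources (memory, file descriptors)
KeepAliveTimeout 5

# Cap the number of requests served over a single connection
# (0 means unlimited; the Apache default is 100)
MaxKeepAliveRequests 100
```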

2. Make sure your Apache HTTPD process is running the worker MPM (instead of the prefork MPM).

Apache can naturally serve more requests at the same time. But there is additional performance to be leveraged depending on how those parallel requests are served.

The prefork MPM essentially means that Apache spawns several httpd processes, with each process serving one single request at a time.

The worker MPM means that Apache spawns several httpd processes, with each process hosting several threads, each capable of serving one request at a time.

The main difference is that processes are more expensive than threads. They are more expensive in memory (as the entire process needs to be duplicated, instead of just the thread stack) and they are more expensive for the CPU (which spends more time switching between process contexts than between thread contexts).

Before you read on, you should know that the worker MPM is not naturally compatible with all Apache extensions, especially with the all-popular mod_php. This happens because mod_php is not thread-safe – that is, it’s built for running in its own process, not in a thread. To work around this limitation, you should use FastCGI and compile PHP with FastCGI support.
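As a rough sketch of what the FastCGI approach looks like with mod_fcgid (assuming mod_fcgid is installed and the php-cgi binary lives at /usr/bin/php-cgi – both the module availability and the paths vary by distribution, so adjust for your system):

```apache
<IfModule mod_fcgid.c>
    # Hand .php files to FastCGI instead of an in-process mod_php
    AddHandler fcgid-script .php

    # php-cgi runs in its own processes, so thread safety of the
    # worker MPM is no longer a concern
    FcgidWrapper /usr/bin/php-cgi .php

    <Directory /var/www/html>
        Options +ExecCGI
    </Directory>
</IfModule>
```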

In order to run Apache httpd with the worker MPM (on RHEL/CentOS-style systems), you need to modify the file

/etc/sysconfig/httpd

by adding (or removing the # from) the following line:

HTTPD=/usr/sbin/httpd.worker

If you’re using PHP, you might get the following error:

Apache is running a threaded MPM, but your PHP Module is not compiled to be threadsafe. You need to recompile PHP.

In this case, see the note above about using FastCGI to allow PHP to run in its own memory space.

3. Fine tune your Multi-Processing Module

Activating worker as your MPM is a good start, but it’s not enough. The right number of threads and processes greatly depends on your configuration, especially on the number of CPU/vCPU cores the machine has.

Below you’ll find an example of a worker MPM configuration. Please note that this isn’t a universal solution. You should also read Apache’s own MPM documentation and always do your own testing.

<IfModule worker.c>
ServerLimit        4
StartServers         2
MaxClients         128
MinSpareThreads     15
MaxSpareThreads     35
ThreadsPerChild     32
MaxRequestsPerChild  1000
</IfModule>

Let’s see what every directive means:

  • ServerLimit is the maximum number of httpd processes that will run.
  • StartServers is the number of httpd processes that are first started when you run Apache.
  • MaxClients is the maximum number of clients which are served simultaneously at any one point.
  • MinSpareThreads, MaxSpareThreads are the minimum and maximum number of idling threads which are to be available at any one point.
  • ThreadsPerChild defines how many threads each process hosts. 16 or 32 is a good idea and you probably shouldn’t go above that.
  • MaxRequestsPerChild is the maximum number of HTTP connections served by a process before it is respawned (killed and restarted). This should probably lie between 1000 and 10000 requests. It is a useful feature for limiting memory thrashing, leaks or other related performance degradation issues. Note that with KeepAlive On, httpd counts the number of connections (which can contain several requests), not the number of requests.

You should consider these following rules of thumb:

  • MaxClients <= ServerLimit*ThreadsPerChild (the example above uses equality: 4*32=128)
  • Make sure that ServerLimit < ThreadsPerChild. Always consider scaling out the number of threads before the number of processes. The reason for this is that process switching is more CPU-expensive than thread switching (mostly because threads share the same memory space).
  • ServerLimit should be at least equal to the number of cores, but no larger than four times the number of cores. I recommend going for two times the number of cores (unless your tests show different results).
  • StartServers should be half of ServerLimit.
  • ThreadsPerChild should be between 16 and 32 threads.
  • Unless you have good reason (your own tests), don’t exceed 32 threads per process.
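As a worked example of these rules of thumb, here is what they would produce for a hypothetical 8-core machine (again: a starting point for your own tests, not a prescription):

```apache
<IfModule worker.c>
    # 2 x 8 cores, per the rule of thumb above
    ServerLimit           16
    # Half of ServerLimit
    StartServers           8
    # Stay at the recommended upper bound of 32 threads per process
    ThreadsPerChild       32
    # MaxClients <= ServerLimit * ThreadsPerChild = 16 * 32 = 512
    MaxClients           512
    MinSpareThreads       25
    MaxSpareThreads       75
    # Respawn processes periodically to limit leaks and thrashing
    MaxRequestsPerChild 2000
</IfModule>
```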

So there you have it.

Always remember: your own research and your own tests on your own scenarios are always more relevant than what I or anyone else advise.

In Part II of the article, we’ll get into details about caching:

4. Mark all static resources as cacheable

5. Leverage mod_disk_cache and ramfs to achieve in memory caching

6. Leverage database query caching

In the last part of the article, we’ll be covering:

7. Minify and join JS and CSS files.

8. Use CSS image sprites as much as possible

9. Precompile PHP files

10. Mount pid files on ramfs


Written by: Bogdan
