The three secrets to optimal Drupal performance are cache, cache, and more cache. Every layer of the Drupal
server stack offers its own caching options, and you should familiarize yourself with how to take advantage of
all of them. Here’s a list of key areas to consider as you look for opportunities to improve the performance of
your site:
PHP opcode cache: Opcode caching is critical, and its importance cannot be overstated. There is no good reason not to run an opcode cache, unless you happen to prefer high server loads and slow page load times. For PHP opcode caches, your choices include APC, XCache, and eAccelerator, any of which can easily be installed into your PHP environment. The best practice for opcode caching is APC (drupal.org/project/apc).
See Figure 23-1 for an example of a report generated by APC.
Reverse proxy cache: A reverse proxy cache takes a tremendous amount of load off your web servers. A proxy cache is a fast web server that sits in front of your back-end web servers, caching any cacheable content passing through it (as a write-through cache) so that subsequent web requests are served directly from the proxy cache rather than from your back-end servers. I’ll talk about Varnish, the preferred solution for reverse proxy caching, in a bit.
Database caches: MySQL has its own built-in caches, particularly the query cache (query_cache_size) and the InnoDB buffer pool (innodb_buffer_pool_size), both of which ought to be set as high as your database server’s available memory allows.
Drupal caches: Drupal has its own caches for pages, blocks, and Views. Visit the Drupal performance page in your Drupal admin interface, and turn them all on. I’ll also talk about Pressflow, an optimized version of Drupal that improves on Drupal’s own internal caching mechanisms.
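For example, the two MySQL settings mentioned above might be raised in my.cnf like this (the sizes shown are hypothetical starting points, not recommendations for your hardware):

```ini
# my.cnf -- example values only; size these to your database server's memory
[mysqld]
# MySQL query cache (holds results of identical SELECT statements)
query_cache_size        = 64M
# InnoDB buffer pool (caches table data and indexes in memory)
innodb_buffer_pool_size = 4G
```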
Figure 23-1. Alternative PHP Cache (APC) comes with an interface that displays
memory allocation and the files currently within the cache.
Often the system takes a performance hit when data must be moved to or from a slower device such as a hard disk drive. What if you could bypass this operation entirely for data that you could afford to lose (like session data)? Enter memcached, a system that reads and writes directly to memory.
Memcached is more complicated to set up than other solutions proposed in this chapter, but it is worth talking
about when scalability enhancements are needed in your system.
Drupal has a built-in database cache to cache pages, menus, and other Drupal data, and the MySQL database is capable of caching common queries, but what if your database is straining under the load? You could buy another database server, or you could take the load off of the database altogether by storing that data in memory instead. The memcached system and the PECL Memcache PHP extension (see http://pecl.php.net/package/memcache) are just the tools to do this for you.
The memcached system saves arbitrary data in random access memory and serves the data as fast as possible.
This type of delivery will perform better than anything that depends on hard disk access. Memcached stores
objects and references them with a unique key for each object. It is up to the programmer to determine what
objects to put into memcached. Memcached knows nothing about the type or nature of what is put into it; to
its eyes, it is all a pile of bits with keys for retrieval.
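As an illustration of that key/value interface, here is a minimal sketch using the PECL Memcache extension directly (the host, port, key name, and expiration are hypothetical):

```php
<?php
// Minimal sketch of the PECL Memcache extension's key/value interface.
// Host, port, and key names here are illustrative.
$memcache = new Memcache();
$memcache->connect('127.0.0.1', 11211);

// Store an arbitrary object under a unique key, expiring in 5 minutes.
$node = new stdClass();
$memcache->set('node:123', $node, 0, 300);

// Later: fetch it back by key; get() returns FALSE on a cache miss.
$cached = $memcache->get('node:123');
```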
The simplicity of the system is its advantage. When writing code for Drupal to leverage memcached, developers can decide to cache whatever is seen as the biggest cause of bottlenecks. This might be the results of database queries that are run very often, such as path lookups, or even complex constructions such as fully built nodes and taxonomy vocabularies, both of which require many database queries and generous PHP processing to produce. A memcache module for Drupal, along with a Drupal-specific API for working with the PECL Memcache interface, can be found at http://drupal.org/project/memcache.
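For reference, wiring Drupal 7 to the memcache module is mostly a matter of a few settings.php lines; the sketch below assumes the module lives under sites/all/modules and memcached runs on localhost:

```php
<?php
// Example settings.php lines for the Drupal 7 memcache module.
// The module path and server address are assumptions for this sketch.
$conf['cache_backends'][] = 'sites/all/modules/memcache/memcache.inc';
$conf['cache_default_class'] = 'MemCacheDrupal';
// Keep the form cache in the database so forms survive memcached restarts.
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
$conf['memcache_servers'] = array('127.0.0.1:11211' => 'default');
```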
Optimizing PHP
On Apache servers, you have two ways to execute PHP code: Fastcgi (mod_fcgid, mod_fastcgi, or PHP-FPM) or mod_php. The key difference between them is that mod_php executes PHP code directly inside Apache, whereas the Fastcgi variants pass each PHP request to an external php-cgi process, which executes PHP outside of Apache and then pipes its output back to Apache.
On an Nginx web server (more about Nginx later in this chapter), the choice is simpler because you’re limited to using only the NginxHttpFcgiModule (Fastcgi), as Nginx does not have a built-in PHP interpreter module such as mod_php.

mod_php and the Fastcgi variants perform marginally the same; after all, they’re really using the same underlying PHP interpreter running the same PHP code underneath. The only key difference is where their inputs and outputs are being redirected. Unsurprisingly, benchmarking equally sized mod_php and Fastcgi process pools shows nearly the same server loads and Drupal delivery performance.
An Apache+mod_php process pool with 25 child processes and an Apache+Fastcgi process pool with 25 PHP processes will have the same overall memory footprint and performance characteristics. However, the Fastcgi variants offer the option of sizing your PHP process pool independently from your Apache process pool, while with mod_php your pool of PHP interpreters is equal to the number of Apache processes. For this reason, some may advocate a Fastcgi approach over mod_php because Fastcgi “saves memory.” This might be true if you ignored APC opcode cache size considerations (explained below) and you chose to restrict the total number of Fastcgi processes to dramatically fewer than the number of Apache child processes. However, severely limiting the size of your PHP process pool can severely bottleneck your PHP throughput: that’d be similar to closing three lanes of a busy four-lane highway for no better reason than to “save space,” thereby causing traffic jams. There’s another important memory usage consideration: PHP’s APC opcode cache is shared across mod_php processes (all mod_php processes refer to the same APC cache block), but the APC cache is not shared across php-cgi processes when using mod_fcgid.
Given that the typical size of an APC opcode cache for a Drupal server could be 50MB or more, this means that when using an APC opcode cache (as any reasonable Drupal server should), the entire process pool of Apache and php-cgi processes will altogether use a lot more memory than the same-size pool of Apache and mod_php processes. So which performs better? The answer is that neither mod_php nor Fastcgi performs dramatically better than the other when given the same amount of resources. However, you may consider using a Fastcgi option if you want to tune your Apache process pool size differently than your PHP process pool, or for other reasons, such as on multi-tenant web servers, because Fastcgi offers user-level separation of processes.
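If you take the Fastcgi route via PHP-FPM, a fixed-size PHP pool sized independently of Apache might be sketched like this (the pool size and listen address are hypothetical starting points, not recommendations):

```ini
; PHP-FPM pool config (e.g. www.conf) -- example values only
[www]
listen = 127.0.0.1:9000
; A static pool keeps the PHP process count independent of Apache's pool
pm = static
pm.max_children = 50
```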
Setting PHP Opcode Cache File to /dev/zero
Both APC and XCache offer an option to set the path of the opcode cache. In APC, the path of cache storage, the apc.mmap_file_mask setting, determines which shared memory mechanism it uses. System V IPC shared memory is a decent choice but is limited to only 32MB on most Linux systems; this limit can be raised, but by default it’s not enough opcode cache for typical Drupal sites. POSIX mmap shared memory can share memory blocks of any size; however, it performs quite poorly if that memory is backed by a disk file, as frequent shared memory I/O operations will translate into large and frequent disk I/O operations, which is especially noticeable on slow disks. The solution is to set your memory map path to /dev/zero, which tells mmap not to back the memory region with disk storage. Fortunately, APC uses this mode by default, unless you’ve explicitly set apc.mmap_file_mask to any path other than /dev/zero.
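A sketch of these APC settings in php.ini (the cache size is a hypothetical starting point; note that older APC releases expect apc.shm_size as a bare number of megabytes, without the M suffix):

```ini
; php.ini / apc.ini -- example APC settings
extension = apc.so
; Size the opcode cache generously; the 32MB default is rarely enough for Drupal
apc.shm_size = 96M
; Leave the default /dev/zero mask so mmap memory is not backed by a disk file
apc.mmap_file_mask = /dev/zero
```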
PHP Process Pool Settings
By “PHP process pool” I’m referring to the entire PHP execution process pool on your web server, which
determines how many concurrent PHP requests your server can deliver without queuing up requests. The PHP process pool is managed either by Apache+mod_php or some variant of Fastcgi: mod_fcgid, mod_fastcgi, or
PHP-FPM (FastCGI Process Manager). The PHP process pool tuning considerations are as follows:
Run as many PHP interpreters as memory will allow. If you’re running mod_php, then your PHP pool size is the number of Apache child processes, which is determined by the Apache config settings StartServers, MinSpareServers, MaxSpareServers, and MaxClients, which can all be set to the same amount to keep the pool size constant. If you’re running a Fastcgi variant, such as mod_fcgid, then your PHP pool size settings MaxProcessCount, DefaultMaxClassProcessCount, and DefaultMinClassProcessCount should all be set to the same amount to keep the pool size constant. For an 8GB web server, you may try setting your PHP process pool size to 50, then load test the server by requesting many different Drupal pages with a user client concurrency of 50 and a think time between page requests of at least 1 second per client. If the server runs out of memory and/or begins to scrape swap space, then decrease the PHP process pool size and try again. Server load may inevitably climb during such a load test, but it’s not an issue to be concerned with during this tuning test.

Keep as many idle PHP interpreters hanging around for as long as possible. You want to avoid churning your PHP process pool, which means avoiding constantly reaping and re-spawning PHP interpreters in response to the web traffic load of the moment.
Instead, it’s better to create a constant-size pool of PHP interpreters, as many as your server memory can hold, and have that pool size remain constant even if most of those processes are idle most of the time. For mod_php, you’ll want to set Apache’s StartServers, MinSpareServers, MaxSpareServers, and MaxClients all equal to each other; 50 is a decent starting value for an 8GB Drupal web server. This creates a constant-size preforked pool of Apache+mod_php processes. The other key Apache setting for mod_php is MaxRequestsPerChild, which ideally you will want to set to 0 so that Apache does not re-spawn child processes. But if your web server slowly leaks memory over time, and you strongly suspect mod_php is leaking memory, then you may set MaxRequestsPerChild to 10000 or more, and then dial it down until the memory leak issue is under control.
For mod_fcgid, if you’re experiencing a php-cgi segfault on every 501st PHP request (a known bug in mod_fcgid, which may have already been addressed as of this writing), then you will have to set MaxRequestsPerProcess to 500, which will force each php-cgi interpreter to re-spawn itself every 500 requests. Otherwise, set the mod_fcgid MaxRequestsPerProcess to 0 unless php-cgi processes are leaking memory. Also for mod_fcgid, set IdleTimeout and IdleScanInterval to several hours or more to avoid the overhead of re-spawning PHP interpreters on demand.
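Putting the mod_fcgid advice together, a configuration sketch might look like the following (directive names vary by mod_fcgid version; releases from 2.3.6 onward prefix them with Fcgid, e.g. FcgidMaxProcesses, and the values here are illustrative):

```
# Example mod_fcgid settings -- older directive names shown
# Keep the PHP pool a constant size of 50 processes
MaxProcessCount              50
DefaultMinClassProcessCount  50
DefaultMaxClassProcessCount  50
# 0 = never re-spawn; use 500 instead if you hit the 501st-request segfault bug
MaxRequestsPerProcess        0
# Keep idle interpreters around for hours instead of reaping them
IdleTimeout                  28800
IdleScanInterval             28800
```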
Tuning Apache
There are several configuration parameters that will help speed the execution of requests for Drupal sites running on an Apache web server. Some of the biggest improvements can be made through the following
recommendations.
mod_expires
This Apache module lets Drupal send out Expires HTTP headers, caching all static files in the user’s browser for two weeks or until a newer version of a file exists. This goes for all images, CSS and JavaScript files, and other static files. The end result is reduced bandwidth and less traffic for the web server to negotiate. Drupal is preconfigured to work with mod_expires and will use it if it is available. The settings for mod_expires are found in Drupal’s .htaccess file.
# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
# Enable expirations.
ExpiresActive On
# Cache all files for 2 weeks after access (A).
ExpiresDefault A1209600
<FilesMatch \.php$>
# Do not allow PHP scripts to be cached unless they explicitly send cache
# headers themselves. Otherwise all scripts would have to overwrite the
# headers set by mod_expires if they want another caching behavior. This may
# fail if an error occurs early in the bootstrap process, and it may cause
# problems if a non-Drupal PHP file is installed in a subdirectory.
ExpiresActive Off
</FilesMatch>
</IfModule>
We can’t let mod_expires cache PHP-generated content, because the HTML content Drupal produces is not
always static. This is the reason Drupal has its own internal caching system for its HTML output (i.e., page
caching).
Moving Directives from .htaccess to httpd.conf
Drupal ships with two .htaccess files: one is at the Drupal root, and the other is automatically generated after
you create your directory to store uploaded files and visit Configuration -> File system to tell Drupal where
the directory is. Any .htaccess files are searched for, read, and parsed on every request. In contrast, httpd.conf
is read only when Apache is started. Apache directives can live in either file. If you have control of your own
server, you should move the contents of the .htaccess files to the main Apache configuration file (httpd.conf)
and disable .htaccess lookups within your web server root by setting AllowOverride to None:
<Directory />
    AllowOverride None
    ...
</Directory>
This prevents Apache from traversing up the directory tree of every request looking for the .htaccess file to
execute. Apache will then have to do less work for each request, giving it more time to serve more requests.
MPM Prefork vs. Apache MPM Worker
The choice of Apache prefork vs. worker translates into whether to use multiple Apache child processes or fewer child processes, each with multiple threads. Generally for Drupal, the better choice is Apache prefork. Here’s
why:
PHP is not thread-safe, so if you’re using mod_php, then your only real choice is Apache prefork. If you’re
using Fastcgi (such as mod_fastcgi or mod_fcgid), then you could use Apache MPM worker because PHP
requests would be handled externally from Apache. However, using Apache MPM worker instead of Apache
MPM prefork is still not the big win that some make it out to be because there’s nothing magical about threads
that makes a multithreaded application automatically faster and more scalable than a preforked multiprocess equivalent, even on multi-core systems, and this is true for a few reasons:
First, it helps to demystify what threads really are to a Linux operating system: threads are mostly the same as child processes. What distinguishes a thread from a child process is that a thread has direct shared access to the memory contents of its parent process, whereas a forked child process gets a copy-on-write reference to the
memory contents of its parent process. This distinction offers a slight performance advantage to threads, which is then easily squandered on the often complex logistics of synchronizing shared memory access between threads.
Second, the perception that threads use significantly less memory than separate child processes is not what it seems. Using common system tools such as top and ps, it appears as though each Apache child process is using almost as much memory as its Apache parent process. In fact, most of the memory footprint of each Apache child process consists of the same exact memory regions used by the Apache parent process being repeatedly counted multiple times. This is because most of the memory footprint of child processes is the contents of shared libraries, which most operating systems are smart enough to load into memory once; every additional process using those same libraries refers to the first shared copy in memory. Another memory usage consideration is that child processes will share most of the memory contents of their parent unless they modify those contents (copy-on-write).
Third, you can kill runaway Apache child processes, but you can’t kill runaway Apache threads without
restarting all of Apache. From a server admin perspective, it’s easier to diagnose and address problems in a
prefork Apache process pool than a threaded Apache process pool. Of course, your mileage may vary, so benchmarking different Apache MPM configurations is still a worthy exercise.
Balancing the Apache Pool Size
When using Apache prefork, you want to size your Apache child process pool to avoid process pool churning. In other words, when the Apache server starts, you want to immediately prefork a large pool of Apache processes (as many as your web server memory can support) and have that entire pool of child processes present and waiting for requests, even if they are idle most of the time, rather than constantly incurring the performance overhead of killing and re-spawning Apache child processes in response to the traffic level of the moment. Here are example Apache prefork settings for a Drupal web server running mod_php:
StartServers 40
MinSpareServers 40
MaxSpareServers 40
MaxClients 80
MaxRequestsPerChild 20000
This is telling Apache to start 40 child processes immediately, and to always keep 40 processes even if traffic is low, but if traffic is really heavy, then to burst up to 80 child processes. (You can raise the 40 and 80 limits according to your own server dimensions.) You may look at this and ask, “Well, isn’t it a waste of memory to have big fat idle Apache processes hanging about?” But remember: the goal is to have fast page delivery, and there is no prize for having a lot of free memory. “My server is slow, but look at all that free RAM!” If you have the memory, then use it!
Decreasing Apache Timeout
The Timeout setting in the Apache config determines how long a web client can hold a connection open
without saying anything. Apache’s default Timeout is 5 minutes (300 seconds), which is far too polite. Decrease Apache’s Timeout to 20 seconds or less.
Disabling Unused Apache Modules
Comment out any Apache LoadModule directives that you are certain are not needed. Candidates include mod_cgi, mod_dav, and mod_ldap.
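For example, in httpd.conf (module file paths vary by distribution):

```
# httpd.conf -- comment out modules you are certain are unused
#LoadModule cgi_module modules/mod_cgi.so
#LoadModule dav_module modules/mod_dav.so
#LoadModule ldap_module modules/mod_ldap.so
```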
Using Nginx Instead of Apache
The more adventurous LAMP admins are substituting Nginx for Apache. Nginx is an excellent general-purpose web server with massive scalability. However, Nginx does not support mod_php; rather, you’re limited to using Fastcgi (php-cgi) to serve PHP requests, which is not a bad choice, just different. Also, Nginx does not comprehend Apache .htaccess files, so you’ll have to translate any htaccess-specific directives in your Drupal code base, such as Boost cache rules, into equivalent Nginx configuration directives. As for which is faster, many would argue in favor of Nginx. But the real bottleneck in any Drupal stack is going to be the PHP or database layer rather than the choice of web server. Nonetheless, Nginx’s strengths make it a good fit as a load balancer (see its http upstream module) and static content server.
Using Pressflow
Pressflow is a drop-in replacement for the standard Drupal core, including many performance enhancements over and above Drupal core. Otherwise, from all outward appearances, Pressflow is entirely the same as Drupal. Many of Pressflow’s features continue to make their way into the Drupal core; however, the folks at Four Kitchens continue to push the envelope when it comes to optimizing Drupal. At the time this book was written, there wasn’t an official release of Pressflow for Drupal 7. For up-to-date information on the features and functionality incorporated into Pressflow, visit www.pressflow.org.
Varnish
Varnish is becoming the darling proxy cache server of the Drupal community. Varnish is a fast and powerful HTTP reverse proxy cache server. A typical Drupal app server may be capable of delivering hundreds of dynamic Drupal pages per minute. Varnish offers the ability to deliver thousands of cached Drupal pages per second! Furthermore, requests served from Varnish generate no load on your back-end servers, because the cache-delivered requests never reach your back-end servers.
In a typical setup, Varnish is installed to listen on port 80 (the standard web server listening port) so that all web content requests hit Varnish first. Varnish decides whether to serve the request directly from its own cache or to echo the request back to the back-end web servers. The cache and delivery policies are expressed in the local VCL (Varnish Configuration Language) configuration file. VCL offers Varnish admins the ability to set very specific cache policies using conditional expressions resembling JavaScript. VCL also offers the ability to load balance requests across many back-end servers, rewrite requests, change the content of requests, and block requests. Furthermore, the VCL language offers the ability to include inline C code for those wanting to manipulate the request delivery process at the lowest levels possible. Note that Varnish does not support SSL (HTTPS requests) and does not offer separate virtual host configurations in a shared hosting environment; however, in Varnish, VCL expressions can be bracketed inside a conditional based on the target host of the request. It’s also worth noting that Varnish is an HTTP write-through cache and not a generic key/value store, so it’s not a substitute for memcached, nor does it offer a direct API for storing and fetching arbitrary data from cache. Other HTTP proxy cache alternatives include Squid, Apache with mod_cache, and Nginx’s http proxy cache module; however, these options don’t offer the richness of Varnish’s VCL language.

Worth noting is that Varnish is multithreaded, so its scalability is limited to how many Varnish server threads your server can juggle at once. A moderately busy Varnish server may have a few hundred threads running, and a very busy Varnish can peak at just over a thousand threads. If your Varnish is not able to spawn more threads, then additional requests to your web site will be met with “Connection reset” errors. To allow Varnish to spawn more threads, edit the Varnish startup scripts to adjust the -w options (worker thread pool options) passed to the Varnish start command.
The second parameter passed into the -w option is the maximum number of threads Varnish can spawn.
Increase that setting to at least 4000.
Second, on Linux systems, each thread is allocated 8MB of stack space by default, which is far more than any Varnish thread will require. So, in your Varnish startup script, you’ll want to add the command ulimit -s 512 to decrease the default stack space per thread to 512KB.
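Putting the two thread-tuning adjustments together, an excerpt from a Varnish startup script might look like this (the file location, e.g. /etc/default/varnish on Debian-style systems, and most of the option values below are illustrative; only the -w maximum of 4000 and the 512KB stack follow the advice above):

```shell
# Excerpt from a Varnish startup script -- paths and sizes are examples.
# Shrink the per-thread stack from the Linux default of 8MB to 512KB:
ulimit -s 512
# -w min,max,timeout: allow Varnish to spawn up to 4000 worker threads
DAEMON_OPTS="-a :80 \
             -f /etc/varnish/default.vcl \
             -w 100,4000,120 \
             -s malloc,1G"
```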
Normalizing Incoming Requests for Better Varnish Hits
The key to achieving good Varnish cache hit rates is to normalize the incoming HTTP requests so that all anonymous requests for the same URL get the same cache hit from Varnish. To understand Varnish cache coherency, you must first understand how Varnish stores cache entries for each URL. Varnish combines the following incoming request attributes into a hash key, which it uses to store and look up its cache entries:

request URL
incoming Host header
incoming Cookie header
incoming Accept-Encoding header
The issue here is that the Cookie header and the Accept-Encoding header vary from browser to browser. For example, it is highly likely that the variety of browsers hitting your web site have different cookies and thus different Cookie headers. To address the variance of incoming Cookie headers, you’ll ideally want to remove the entire incoming Cookie header during the vcl_recv phase of your Varnish config, like so:
sub vcl_recv {
  # Remove the incoming Cookie header from anonymous requests
  if (req.http.Cookie !~ "(^|;\s*)SESS") {
    unset req.http.Cookie;
  }

  # ... other vcl_recv rules here ...

  # Don't serve cached content to logged-in users
  if (req.http.Cookie ~ "SESS") {
    return(pass);
  }

  # Attempt to serve from cache
  return(lookup);
}
The above VCL snippet checks whether the request is from a logged-in user (one that has a cookie starting with "SESS"), and if it is not, normalizes the Cookie header by removing it altogether. If there is a need to have some cookies from anonymous requests echoed to your back-end servers, then you can adjust the Cookie regex or add a few more lines to be more selective about which cookies ought to miss the Varnish cache lookup phase.

The other incoming request header that needs to be normalized is Accept-Encoding, because it varies slightly across different web browser types. The most common use of the Accept-Encoding header is for the web browser to communicate to the web server that the browser can receive compressed content. The typical VCL snippet to normalize Accept-Encoding looks like this:
# Normalize Accept-Encoding to get better cache coherency
if (req.http.Accept-Encoding) {
  if (req.url ~ "\.(jpg|png|gif|gz|tgz|bz2|tbz|mp3|ogg)$") {
    # No point in compressing media that is already compressed
    remove req.http.Accept-Encoding;
  } elsif (req.http.User-Agent ~ "MSIE 6") {
    # MSIE 6 JS bug workaround
    unset req.http.Accept-Encoding;
  } elsif (req.http.Accept-Encoding ~ "gzip") {
    set req.http.Accept-Encoding = "gzip";
  } elsif (req.http.Accept-Encoding ~ "deflate") {
    set req.http.Accept-Encoding = "deflate";
  } else {
    # unknown algorithm
    remove req.http.Accept-Encoding;
  }
}
Varnish: Finding Extraneous Cookies
The following command line on your Varnish server is useful for watching live incoming Cookie headers that are being echoed from Varnish to your back-end servers:
varnishlog | grep TxHeader | grep Cookie
This is useful for adjusting how the Cookie header is filtered in Varnish.
Boost
The popular Boost module for Drupal (http://drupal.org/project/boost) essentially builds a static file cache for
dynamically generated Drupal content. With the Boost module installed in Drupal, whenever Drupal generates a
dynamic page, Boost will save a static copy of that content so that the next anonymous request for that same
page will be delivered from the Boost cache. A background cron process periodically culls outdated pages from the Boost cache, which are then regenerated on the next request. This approach reduces overall PHP and
MySQL overhead but still requires Apache (or Nginx, IIS, lighttpd) to process a few extra rewrite rules for
each page request. The key to good Boost performance is to put the Boost cache directory on a fast local file
system. Some Drupal admins may consider writing Boost cache files into a shared network file system so that
many web servers can share the same Boost cache files; however, a busy web site can have a lot of file
system I/O arise from Boost cache maintenance, so much so that a network shared file system slows down
considerably, in which case the Boost cache ought to be a local directory on each web server instead.
If each web server has extra memory but slow disks, then you may also consider writing your Boost cache
files to a local ramfs file system, which is a feature of Linux that allows you to create an ephemeral storage
volume that exists entirely in RAM.
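A sketch of such a mount in /etc/fstab (the mount point is hypothetical, and note that ramfs grows without bound, so only trusted processes should be able to write to it):

```
# /etc/fstab -- RAM-backed, ephemeral storage for the Boost cache
ramfs  /var/cache/boost  ramfs  defaults  0  0
```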
Boost vs. Varnish
Although Boost and Varnish are different kinds of caching solutions, Drupal administrators often weigh these two options directly against each other. In general, Boost is easier to set up and administer than Varnish. However, Varnish offers a more general solution to better performance, as it can be used to proxy cache other kinds of content, such as static images and stylesheets, and not just Drupal pages. Varnish also offers the ability to load balance and rewrite requests before they even reach your web server, whereas Boost requests still hit the web server. However, it’s also possible to use Boost and Varnish together. You may just need to tune your HTTP cache expiration headers and Boost cache purging so that Varnish and Boost are refreshing their caches in a timely manner.