Web Site Optimization - GZip compression

HTTP compression, otherwise known as content encoding, is a simple, effective way to save bandwidth and speed up your site. It improves performance and can substantially cut page loading time.

The mechanism is a publicly defined way to compress textual content transferred from web servers to browsers. HTTP compression uses public domain compression algorithms, like gzip and compress, to compress XHTML, JavaScript, CSS, and other text files at the server. This standards-based method of delivering compressed content is built into HTTP/1.1, and most modern browsers that support HTTP/1.1 support ZLIB inflation of deflated documents. In other words, they can decompress compressed files automatically.

Before we start we should explain what content encoding is. When you request a file like http://www.faresweb.net/index.html, your browser talks to a web server. The conversation goes a little like this:

 

HTTP Request

1. Browser: GET me /index.html
2. Server: Ok, let me see if index.html is lying around...
3. Server: Found it! Here's your response code (200 OK) and I'm sending the file.
4. Browser: 100KB? Ouch... waiting, waiting... ok, it's loaded.

Well, the system works, but it's not that efficient. 100KB is a lot of text, and frankly, HTML is redundant. Every <html>, <table> and <div> tag has a closing tag that's almost the same. Words are repeated throughout the document. Any way you slice it, HTML is not lean.

And what's the plan when a file's too big? Zip it!

If we could send a .zip file to the browser (index.html.zip) instead of plain old index.html, we'd save on bandwidth and download time. The browser could download the zipped file, extract it, and then show it to the user, who's in a good mood because the page loaded quickly.

1. Browser: GET index.html? I'll take a compressed version if you've got it.
2. Server: Let me find the file... yep, it's here. And you'll take a compressed version? Awesome.
3. Server: Ok, I've found index.html (200 OK), am zipping it and sending it over.
4. Browser: Great! It's only 10KB. I'll unzip it and show the user.

The formula is simple: Smaller file = faster download = happy user.
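To get a feel for how much the redundancy in HTML is worth, here is a rough sketch in Python. The page is made up for illustration (a table with 500 near-identical rows), so the exact numbers will differ from any real site, but repetitive markup like this usually shrinks to a small fraction of its original size.

```python
import gzip

# Hypothetical page: a table of near-identical rows standing in for real HTML.
rows = "".join(
    f"<tr><td>Item {i}</td><td>Price {i}</td></tr>\n" for i in range(500)
)
html = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")

# Compress the page the same way a gzip-enabled server would.
compressed = gzip.compress(html)

# The browser's job: inflate the payload back into the original bytes.
restored = gzip.decompress(compressed)

print(f"original:   {len(html)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"ratio:      {len(compressed) / len(html):.0%}")
print(f"round trip intact: {restored == html}")
```

Run it against one of your own pages (read the file as bytes instead of building the string) to see what you would actually save.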

The tricky part of this exchange is the browser and server knowing it's ok to send a zipped file over. The agreement has two parts:

  • The browser sends a header telling the server it accepts compressed content (gzip and deflate are two compression schemes): Accept-Encoding: gzip, deflate
  • The server sends a response header if the content is actually compressed: Content-Encoding: gzip

If the server doesn't send the Content-Encoding response header, it means the file is not compressed (the default on many servers). The Accept-Encoding header is just a request by the browser, not a demand. If the server doesn't want to send back compressed content, the browser has to make do with the heavy regular version.
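The two-part agreement can be sketched in a few lines of Python. This is only an illustration, not how any real server or browser is written: build_response and read_response are hypothetical names, headers are plain dictionaries, and the parsing is deliberately simplified (it ignores the optional ";q=" weights a browser may attach to each coding).

```python
import gzip


def build_response(request_headers, body):
    """Server side: gzip the body only if the client said it accepts gzip."""
    accept = request_headers.get("Accept-Encoding", "")
    # "gzip, deflate" -> ["gzip", "deflate"]; any ";q=..." parameter is dropped.
    codings = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    if "gzip" in codings:
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    # No Content-Encoding header means the payload is sent as-is.
    return {}, body


def read_response(response_headers, payload):
    """Browser side: decompress only if the server labeled the payload as gzip."""
    if response_headers.get("Content-Encoding", "").lower() == "gzip":
        return gzip.decompress(payload)
    return payload


# Made-up page used for both requests below.
page = b"<html>" + b"<div>hello</div>" * 200 + b"</html>"

# A browser that advertises gzip support gets the compressed version.
headers, payload = build_response({"Accept-Encoding": "gzip, deflate"}, page)
print(headers, len(payload), "bytes on the wire")

# A client that sends no Accept-Encoding header gets the regular version.
plain_headers, plain_payload = build_response({}, page)
print(plain_headers, len(plain_payload), "bytes on the wire")
```

Either way, read_response hands back the same page; the only difference is how many bytes travelled.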

Setting up the server

The "good news" is that we can't control the browser. It either sends the Accept-encoding: gzip, deflate header or it doesn't.

The server must be configured so it returns zipped content if the browser can handle it, saving bandwidth for everyone.

In Apache, enabling output compression is fairly straightforward. Add the following to your .htaccess file:

# compress text, html, javascript, css, xml:
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript
# Or, compress certain file types by extension:
<Files *.html>
SetOutputFilter DEFLATE
</Files>

Apache actually has two compression options:

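  • mod_deflate is easier to set up and is standard (it ships with Apache 2.x).
  • mod_gzip seems more powerful: you can pre-compress content. It is a separate, third-party module, originally written for Apache 1.3.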

Deflate is quick and works, so I use it; use mod_gzip if that floats your boat. In either case, Apache checks whether the browser sent the Accept-Encoding header and returns the compressed or regular version of the file. However, some older browsers may have trouble with compressed content, and there are special directives you can add to correct this.

If you can't change your .htaccess file, you can use PHP to return compressed content. Give your HTML file a .php extension and add this code to the top:

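<?php
// Compress only when the browser's Accept-Encoding header lists gzip.
if (isset($_SERVER['HTTP_ACCEPT_ENCODING']) && substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler"); else ob_start();
?>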

We check the Accept-Encoding header and return a gzipped version of the file (otherwise the regular version). This is almost like building your own webserver (what fun!). But really, try to use Apache to compress your output if you can. You don't want to monkey with your files.

What about Joomla!

Some applications have built-in support for compressing their pages. For example, in Joomla you can turn on gzip compression by going to Global Configuration > Server and setting gZIP Page Compression to ON.

Cautions

Compressing content on-the-fly uses CPU time and saves bandwidth. Usually this is a great tradeoff given the speed of compression. There are ways to pre-compress static content and send over the compressed versions. This requires some specific configuration; even if it's not possible, compressing output may still be a net win. Using CPU cycles for a faster user experience is well worth it, given the short attention spans on the web.
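As a rough idea of what pre-compressing static content could look like, the Python sketch below writes a .gz copy next to each text file in a directory, so the CPU cost is paid once at deploy time instead of on every request. The function name precompress and the list of file extensions are assumptions made for this example, and the server configuration needed to actually serve the .gz copies is not shown because it depends on your setup.

```python
import gzip
import shutil
from pathlib import Path

# Assumed set of text-like extensions worth compressing; adjust for your site.
TEXT_SUFFIXES = {".html", ".css", ".js", ".xml", ".txt"}


def precompress(root):
    """Write a .gz copy next to every text file under root; return the new paths."""
    written = []
    # sorted() takes a snapshot first, so the new .gz files are not revisited.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in TEXT_SUFFIXES:
            target = path.with_name(path.name + ".gz")
            # Level 9 is the slowest and smallest; fine for a one-off build step.
            with path.open("rb") as src, gzip.open(target, "wb", compresslevel=9) as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Calling precompress("public_html") (a hypothetical directory name) would leave index.html untouched and add index.html.gz beside it. Images and other already-compressed formats are skipped on purpose, since gzipping them again gains little.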

 
