Ensuring sites are cache-friendly is an important part of deploying a website. Sites that load quickly and reduce bandwidth costs are great, especially when there are lots of visitors.
I've been experimenting with Rack-Cache, which is an excellent project. It is a full-featured cache for Rack with support for multiple backends.
It is important to keep in mind that not all resources are suited for caching. For example, I made two separate caches as part of my application: a file cache and a dynamic content cache. This is because files on disk don't need to be stored in a cache, whereas dynamically generated content does (otherwise you'd have to regenerate it each time).
Caching for files and other static resources should rely on ETags. Each static resource has an ETag, which is typically a hash of the file size and last modified time. This is pretty easy to implement.
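As a rough sketch of the ETag idea above, here is one way to derive a tag from a file's size and modification time and answer a conditional request. The helper names `file_etag` and `conditional_response` are made up for illustration, and MD5 is just one convenient hash; this is not how Rack-Cache or any particular server computes its ETags.

```ruby
require "digest/md5"
require "tempfile"

# Hypothetical helper: hash the file size and mtime into a quoted ETag.
def file_etag(path)
  stat = File.stat(path)
  fingerprint = "#{stat.size}-#{stat.mtime.to_i}"
  %("#{Digest::MD5.hexdigest(fingerprint)}")
end

# Hypothetical helper: return [status, headers] for a conditional GET.
# A matching If-None-Match value means the client's copy is still valid.
def conditional_response(path, if_none_match)
  etag = file_etag(path)
  status = if_none_match == etag ? 304 : 200
  [status, { "etag" => etag }]
end

# Usage: the first request gets the full response, the revalidation a 304.
Tempfile.create("asset") do |file|
  file.write("body { color: black }")
  file.flush
  etag = file_etag(file.path)
  puts conditional_response(file.path, nil).first   # 200
  puts conditional_response(file.path, etag).first  # 304
end
```

Because the tag only depends on `File.stat`, it can be computed without reading the file contents, which is what makes this approach cheap for static resources.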
Caching for dynamically generated content should typically use last modified time exclusively, and only for a short period of time (such as 1 hour). This ensures that your site won't be overloaded generating content (when you get slashdotted), but that content will still be regenerated fairly frequently.
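The approach for dynamic content could look something like the following Rack-compatible sketch. The `DynamicPage` class, its `last_modified:` hook and the render block are all hypothetical names chosen for this example; the one-hour `max-age` mirrors the figure mentioned above.

```ruby
require "time"

MAX_AGE = 3600 # one hour, as in the text above

# Hypothetical Rack-compatible app. `last_modified` is a callable returning
# a Time, and the block renders the page body; both are assumptions.
class DynamicPage
  def initialize(last_modified:, &render)
    @last_modified = last_modified
    @render = render
  end

  def call(env)
    modified = @last_modified.call
    # Lowercase header names work with both older and newer Rack versions.
    headers = {
      "last-modified" => modified.httpdate,
      "cache-control" => "public, max-age=#{MAX_AGE}",
    }
    since = parse_http_date(env["HTTP_IF_MODIFIED_SINCE"])
    # HTTP dates have one-second resolution, so compare whole seconds.
    if since && modified.to_i <= since.to_i
      [304, headers, []]
    else
      [200, headers.merge("content-type" => "text/html"), [@render.call]]
    end
  end

  private

  # Returns nil for a missing or malformed date instead of raising.
  def parse_http_date(value)
    value && Time.httpdate(value)
  rescue ArgumentError
    nil
  end
end
```

The render block only runs on the 200 path, so a client or intermediate cache that revalidates with `If-Modified-Since` costs almost nothing to answer.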
Also, just because you are caching content doesn't mean your page can't have dynamic elements. AJAX can provide interactive RSS feeds, swap images and update content with very little effort. This means that the majority of your content can be cached while specific parts are generated dynamically on the client. This is something I'm experimenting with.
Debugging Cache Issues
I had problems because Apache was adding a second set of Cache-Control headers to all responses. This was because of a global ExpiresDefault directive, which simply appends another Cache-Control header. This can cause incorrect cache information to propagate across the internet. Figuring out all the little problems took me a while, since there are many layers that can potentially cache information.
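A quick way to spot this kind of duplication is to collect every value of a given header from the raw response lines, for example as captured from `curl -sI`. The helper `header_values` and the sample headers below are invented for illustration; they are not the actual headers my server produced.

```ruby
# Hypothetical helper: return every value sent for `name`, in order,
# matching the header name case-insensitively.
def header_values(raw_headers, name)
  raw_headers
    .map { |line| line.split(":", 2) }
    .select { |key, value| value && key.strip.casecmp?(name) }
    .map { |_, value| value.strip }
end

# Made-up sample response headers showing the conflict.
raw = [
  "Content-Type: text/html",
  "Cache-Control: public, max-age=3600",
  "Cache-Control: max-age=86400",
]

values = header_values(raw, "Cache-Control")
if values.uniq.size > 1
  warn "conflicting Cache-Control headers: #{values.inspect}"
end
```

More than one distinct value is a hint that some layer in the stack, such as a global server directive, is adding its own header on top of the application's.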
I found two great tools for checking whether your pages are serving the correct headers, and your stack responds to things such as
Both of these sites will point out issues with the content you are serving, and highlight potential problems with resources which won't be cached properly due to missing headers, incorrect headers and/or incorrect behavior.