MySQL is an incredibly powerful and robust database which is why it powers so many web sites and web applications today. However, even the best database often needs additional help to keep up with the high volume demands of popular sites or web applications. MySQL is no exception. One way that many developers squeeze additional performance from MySQL is through caching in order to reduce the number of queries hitting the database. There are a number of techniques for caching MySQL queries.
The simplest system for MySQL caching is a file based system. In this style of caching, a query is stored in a file and the data is loaded from the file, thus saving a hit on the database. The most common application of this would be in a content management system where perhaps the content of a web page is stored in a MySQL database. This content probably does not change frequently. A simple cache might involve writing out the generated page to static files that could be served by the web server. Another technique would be to write out the results of a MySQL query to a file in XML format. This could be used for report generation or other data intensive functions that need MySQL caching to improve performance.
Another system of MySQL caching involves some sort of application caching. Most web development platforms and languages have their own caching mechanisms. For example, PHP provides API to Memcache framework for building custom caching classes. These classes can be configured to cache all or part of a page and control which queries will be cached and which parts need to be fetched on each request. These MySQL caching systems provide a greater level of flexibility because of their integration into the development platform.
Finally, MySQL has a built-in query cache that can substantially improve its performance. This cache can be configured through MySQL’s server configuration file, typicall my.cnf. MySQL’s query cache can be turned off manually by the client so there is some granular control over which queries to cache. It’s also important to thoroughly read the MySQL query cache documentation to understand which types of queries can or cannot be cached for MySQL. For example, queries that contain bind parameters or queries that use functions like CURRENT_TIMESTAMP cannot be cached by the MySQL query cache. You’ll need to design your queries appropriately to take full advantage of the cache.
Choose Cache Wisely
It is important to match the MySQL caching system to the type of content being cached and the business rules for determining when to expire (remove) cached items. File caching systems break down when more than one server is involved. Each file would need to be cached on each web server. This quickly adds to the complexity of MySQL caching. In a similar fashion, data that changes frequently or relies on fields that change from query to query, such as a TIMESTAMP or random value, require a different approach to caching. These types of queries usually need to be segmented in some way and only the less dynamic parts cached. This is also a scenario where caching snippets of data can help improve performance.
Once you’ve decided on a way to perform your MySQL caching, you’ll need to determine how to tell if the items in the cache are still valid. As data changes or updates are made, your cache will need to decide whether to serve the cached data or to re-query the database for a fresh copy. There are several methods for determining this and each method works better for some types of MySQL caching.
Time to Live (TTL)
The simplest method is the TTL or Time to live method. This method assigns an amount of time for which the cached data will be considered valid. This is the same method used by the HTTP expire header. In a TTL cache, the data is refreshed whenever it expires. This method is the simplest to implement. It works extremely well with file based MySQL caching. Your caching code can simply look at the file modification time and determine if a cached file is too old. If it is, the data is retrieved from the database and a new file made in the cache.
Invalidate on Update
Another method involves invalidating items in the cache whenever the underlying data is updated. At a very basic level, this is how the built-in MySQL query cache works. It invalidates cached queries whenever a table upon which the query is based gets updated. With this method, your system will need a way to determine which cached objects are associated with what data and you’ll need a messaging or notification system when data is updated. In this type of system, an update will trigger a notification to the cache to invalidate the cached object. When a new request comes in for that object, it would be updated and returned to the cache. The biggest drawback to this system is that the process of marking items in the cache as invalid can be resource intensive if you have many cached items or data updates very frequently. Another way to think of this method is that it is very similar to the garbage collection processes in modern programming languages like Java.
Update vs. Invalidate
A variation on invalidating the cache is to update it. In this method, when data is updated and the notification is sent, the cache updates the cached object rather than simply marking it as invalid. This takes much more design work as your cache needs to have an awareness of how to create or update the objects stored in it. This usually means that this method works best with caching systems that are integrated into the development platform so that the cache can call the methods needed to recreate the various objects as they are updated.
In this method a version is attached to the data used by an object. Rather than sending notifications on updates to the data, the version number is increased. The cache compares the version of cached objects to the stored versions and updates the objects when the version does not match. This saves a lot of the overhead of large sets of objects being invalidated or some of the complexity of updating objects each time data is updated.
MySQL caching is an important way to improve the performance of your web platform. However, implementing MySQL caching takes some care and planning to yield the best performance gains. It is crucial to consider the type of data being cached, how frequently it will be updated, how to cache objects and even more importantly, how to know when to remove objects from the cache.