This will be a two-part post on some of my thoughts on designing a caching system for a service oriented architecture, and some of the results from a series of prototypes done to flesh out the design.
Part 1 – Overview, use and potential approaches
Part 2 – Prototype designs, results and lessons learned
When designing a service oriented architecture (SOA) that is expected to see high volumes of traffic, one of the potential architectural components you may be looking at is a caching system. In a high traffic system a cache can be essential in increasing performance and enabling the scalability of the overall system.
Why use a Caching system?
Caching systems inarguably add another layer of complexity, as well as another potential point of failure, to a system’s architecture, so their use should be carefully weighed against the benefits you anticipate.
There are usually two main reasons for employing a caching system:
- Offload Database work
- Drive down response times
Offloading database work is essentially a way of enabling a pseudo-scaling of the database tier, especially in cases where the database platform doesn’t inherently allow a scaling out. By performing more of the work of retrieving data without involving the database, we are effectively scaling out the capacity of the tier.
The requirement to drive down response times, or keep response times stable as the system grows, is another common reason to employ a caching system. Retrieving data from a cache held in memory is orders of magnitude faster than retrieving it from the database tier in most circumstances, and especially so if the database system itself does not employ some internal caching to keep the needed data in memory. If the database system determines it needs to read the data from disk, the cache fetch will seem like a Maserati compared to a ‘Model T’.
Where should you use caching?
Caching should likely be considered at all levels of the system – client tier, web tier and services tier. At each tier of your architecture the needs of the application and the type of data in use will dictate what gets cached and the strategy employed.
The goal of caching within a SOA system that is expected to scale is to keep the cached information at the layer that makes the most sense from a use and manageability standpoint.
Client Tier

At the client tier, data should be cached to avoid round trips back to the server when possible. This is likely one of the most expensive calls that can be made in a system, as the network traversed in this call is likely a good distance from the web application or services tier. The best approach to performance here is to not incur the overhead of the call at all if possible.
Thick clients have long used local caching strategies to hold onto data as long as possible. After a database call a thick client would keep the set of data retrieved in memory between user actions and screen changes.
Browser based clients have had a more difficult time caching data due to the stateless nature of the web. Some approaches here have been to store data in the page itself. This can lead to page size bloat and slower response times in a typical scenario such as ASP.NET where the “stored data” is round-tripped with the page. With the rising popularity of AJAX style programming and partial page refreshing, the browser is becoming a more intelligent presentation layer compared to the typical post-back or complete page refresh model.
Web Application Tier
The web application tier has a role to participate in the caching strategy as well. Since in a proper N-tier system, the web application tier is responsible for serving of resources (pages, images, etc), it should employ its caching strategy around these object types primarily. Employing a cache strategy around how long a page can be served from cache versus being regenerated should be one of the primary focuses of caching at this tier.
While tempting to cache data at the web application tier, this should be avoided as there are several problems that could arise from this in a scaled and load balanced environment, such as data only being available to certain web servers, or the distributed maintenance of a cache from multiple web servers.
Services Tier

Caching at the services tier should target “data”, since this is the single point of access to data within a SOA based system. As such it makes sense to control the population, refreshing and invalidation of a data cache from this tier. The services tier lends itself particularly well to the caching of data as its primary purpose is to act as the facade that serves all requests to retrieve or update data.
Where it retrieves this data from is of no concern to the caller other than from the standpoint that the data is correct and accurate. Since employing a cache at this tier is transparent to the caller, offloads work from the database and is more manageable from the standpoint of trapping changes that require updating the cache, it makes the most sense to cache data at this tier.
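To make this concrete, here is a minimal cache-aside sketch of a services-tier facade (Python for illustration; the names `ServiceCache` and `load_customer` are hypothetical, not from our prototype). The caller asks the facade for data and never knows whether it came from the cache or the database:

```python
import time

class ServiceCache:
    """Minimal cache-aside helper: check the cache first, fall back to the
    data store on a miss, and populate the cache for later callers."""

    def __init__(self, ttl_seconds=300):
        self._store = {}          # key -> (value, expiry timestamp)
        self._ttl = ttl_seconds

    def get(self, key, fetch):
        entry = self._store.get(key)
        if entry is not None:
            value, expires_at = entry
            if time.time() < expires_at:
                return value          # cache hit
        value = fetch(key)            # cache miss: hit the data store
        self._store[key] = (value, time.time() + self._ttl)
        return value

    def invalidate(self, key):
        self._store.pop(key, None)

# The service facade is the only code path touching the cache.
cache = ServiceCache(ttl_seconds=60)
calls = []

def load_customer(key):               # stand-in for a database query
    calls.append(key)
    return {"id": key, "name": "Acme"}

first = cache.get("customer:42", load_customer)
second = cache.get("customer:42", load_customer)   # served from cache
```

On a miss the facade fetches from the data store and populates the cache, so subsequent callers within the TTL are served from memory without touching the database.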
What types of data should you cache?
The type of data that you determine should be cached should ultimately provide an increase in performance to the system without dramatically increasing the complexity of the system. There are certain types or classes of data that make more sense to cache than others. In order of priority:
- Data that changes infrequently
- Expensive queries
- Data that is accessed frequently
Some thought should also be given to the dependencies between cached object types. I cover this more in the considerations section, but a high number of dependencies between objects may be a factor in determining whether you cache these object types.
Data that changes infrequently
Data items that are fairly static in nature make ideal candidates for caching. The benefit here is that there is a low overhead to managing this type of data in a cache: updates to the data are infrequent, so there is less need to clear items from the cache and/or refresh them.
An example of data that changes infrequently could be policy data that drives certain actions within the application.
Expensive queries

Queries that are expensive in either time or resource usage to run are another ideal candidate for caching. Caching this type of data will provide the aforementioned “scaling” increase at the data tier since the underlying database is freed from running the majority of these queries, allowing it to run other queries, which in effect provides the same benefit as scaling the database system.
Examples of an expensive query might be a query that aggregates several pieces of data together or performs some level of trending, along the lines of something you may see in a dashboard style view.
Data that is accessed frequently
Data that is accessed frequently also makes a nice candidate for caching since this type of data – even if cheap to execute and return – provides a constant load on the underlying system. Being able to effectively take this constant load off of the database and move it to the cache can yield significant performance improvements.
So in general there are a couple of factors that drive the cost/benefit analysis as to what should be cached: cost of data and frequency of change:
| Cost of Data | Frequency of Change | Recommendation |
|---|---|---|
| High | Low | Benefit of caching |
| High | High | Best not to cache* |
* While the cost of the data is very high to execute and retrieve, the benefits of caching are reduced by the frequency of change since the frequent changes will lead to high cache turnover, frequent refreshing/re-querying of data and an overall higher level of data management for this item in the cache.
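The cost/frequency trade-off can be encoded as a simple decision helper; this is a rough sketch, with the inputs simplified to high/low labels of my own choosing:

```python
def caching_recommendation(cost_of_data, change_frequency):
    """Encode the cost/frequency trade-off: expensive, stable data benefits
    most; data that changes often erodes the benefit through churn."""
    if cost_of_data == "high" and change_frequency == "low":
        return "cache"          # high benefit, low cache maintenance
    if change_frequency == "high":
        return "avoid caching"  # churn outweighs the savings
    return "cache if accessed frequently"
```

Cheap, stable data falls into the third bucket: it is worth caching only when it is requested often enough to put constant load on the database.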
Two approaches we designed and tested in a prototype were the “Write Through” and “Data Event Driven” approaches. Both approaches have advantages and disadvantages associated with them and should be carefully considered in the context of how the system is used and the way in which data is interacted with.
Write Through

The write through cache is a cache implementation where the cache is updated during the operation that is updating a piece of data, followed by a subsequent update of the data in the underlying data store. In essence this is a cache first and data store second type of model.
This type of model is most effective in a system where there is a well defined set of interfaces for interacting with the data and all areas of the system use them (i.e. a single source for all data interactions). A single source of updates allows for a more manageable point from which to maintain the data in the cache, whereby a single component or code path is responsible for the update or refresh of the cache for a given operation.
Another consideration in utilizing this type of design is how concurrent updates occur in the system. There exists the possibility that two distinct operations are updating a piece of data, both of which are attempting to first change the data in the cache and then in the data store. Most typical data stores provide a mechanism for handling concurrent updates, usually through a locking mechanism. This may or may not be the case in your cache provider.
Since the first update occurring in the system will be to the cache (not the data store) it is essential that there be some mechanism to effectively handle the ability for concurrent attempts to update data. As mentioned some cache implementations provide the ability to lock a cache item while it is being updated effectively reproducing the same behavior as the data store.
- Easy to implement given a single point of interaction to data within the system
- May need to implement concurrency handling for updates to the cache
- Calls that directly modify the database or do not use the “single point of interaction for data” can lead to stale cache data items
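A minimal write-through sketch might look like the following (Python for illustration; the per-key lock stands in for the cache-item locking that some cache providers offer, as discussed above):

```python
import threading

class WriteThroughCache:
    """Write-through sketch: update the cache entry first (under a per-key
    lock to serialize concurrent writers), then the backing data store."""

    def __init__(self, backing_store):
        self._cache = {}
        self._backing = backing_store
        self._locks = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def write(self, key, value):
        with self._lock_for(key):       # mimic the data store's row locking
            self._cache[key] = value    # cache first...
            self._backing[key] = value  # ...data store second

    def read(self, key):
        return self._cache.get(key, self._backing.get(key))

db = {}                                 # stand-in for the data store
cache = WriteThroughCache(db)
cache.write("policy:1", {"max_discount": 0.1})
```

Note the third disadvantage above still applies: anything that writes to `db` directly, bypassing `write`, leaves a stale entry in the cache.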
Data Event Driven
The data driven model is a cache implementation where the signal for a change to the cache comes from a change to the underlying data in the data store. Basically the data “signals” that it has changed, and the cache is updated based on this.
There are two ways to handle a data driven design – the push or pull method.
In the pull method, there would exist a way to actively monitor the underlying data to detect changes and then react to those changes by updating the cache. An example of this would be using something like the SqlDependency feature in Microsoft ADO.NET. Typically you set up a query to watch some set of data, and when the results of that query change you are signaled and can react to the change. In my opinion, and through testing, this does not scale well to larger systems since the number of items that need to be set up and then polled – utilizing system resources – can grow very large. For simpler systems, or those that do not have a requirement to cache a large number of different data types, this may be appropriate.
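A stripped-down illustration of the pull model (plain Python rather than SqlDependency; the result-fingerprinting approach is my own stand-in for the real change-detection mechanism): re-run the watched query on a schedule, hash the results, and refresh the cache only when the hash changes.

```python
import hashlib
import json

class PollingWatcher:
    """Pull-model sketch: periodically re-run the watched query, fingerprint
    the results, and fire the refresh callback only when they change."""

    def __init__(self, run_query, on_change):
        self._run_query = run_query
        self._on_change = on_change
        self._last_digest = None

    def poll_once(self):
        rows = self._run_query()
        digest = hashlib.sha256(
            json.dumps(rows, sort_keys=True).encode()).hexdigest()
        if digest != self._last_digest:
            self._last_digest = digest
            self._on_change(rows)     # e.g. repopulate the cache entry
            return True
        return False

table = [{"id": 1, "status": "open"}]   # stand-in for the watched data
refreshes = []
watcher = PollingWatcher(lambda: list(table), refreshes.append)
watcher.poll_once()                     # first poll always refreshes
table[0]["status"] = "closed"
watcher.poll_once()                     # change detected -> refresh
```

The scaling problem shows up here directly: every watched item costs a recurring query and comparison, whether or not anything changed.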
In the push method, the data store itself would have a mechanism to watch a set of data for changes and signal an interested party that the data has changed. Our prototype used a combination of triggers and SQL CLR code to detect when changes to a set of data we were interested in occurred and then raise a signal to the cache implementation to refresh the associated data. The benefit of this method was that there was a low overhead associated with tracking the changes to the data versus polling. One of the disadvantages to this approach is the maintenance of the triggers in the system. As the need to track more and more data items grows, the number and complexity of the triggers to detect and deal with changes also increases.
A common disadvantage to both flavors of the Data Event Driven design is that the rollup from data in a normalized database to something that is typically stored in a cache – such as an aggregated type of data object – was exceptionally difficult. Translating what data should be watched for a given object such as a Customer, which might span three normalized tables in the data store, was a chore. In our prototype it meant a minimum of three triggers – one for each table – to watch a portion of the data for a Customer, and then the ability to translate a change detected by any of those three triggers into a ‘Customer’ object that was held by the cache.
The obvious advantage and appeal of this type of system is that any and all changes to the data can be caught and propagated to the cache layer for updates as needed. There would be no stale data in the cache if someone wrote directly to the database or skirted a centralized update control point.
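In miniature, the push model’s table-to-object translation might be sketched like this (Python for illustration; the table names and the `on_row_changed` hook are hypothetical stand-ins for the triggers and SQL CLR signal used in our prototype):

```python
class TriggerDrivenInvalidator:
    """Push-model sketch: one 'trigger' per physical table maps a row change
    back to the logical cached object (here, a Customer) and invalidates it."""

    # which logical cache object a change in each normalized table rolls up to
    TABLE_TO_OBJECT = {
        "customer": "customer",
        "address": "customer",
        "orders": "customer",
    }

    def __init__(self, cache):
        self._cache = cache

    def on_row_changed(self, table, customer_id):
        """Called by the per-table trigger when a row changes."""
        object_type = self.TABLE_TO_OBJECT.get(table)
        if object_type is None:
            return                    # table is not watched
        self._cache.pop(f"{object_type}:{customer_id}", None)

cache = {"customer:7": {"name": "Acme", "city": "Berlin"}}
invalidator = TriggerDrivenInvalidator(cache)
invalidator.on_row_changed("address", 7)   # the address trigger fires
```

Even in this toy form, the mapping table grows with every new watched table, which is exactly the trigger-maintenance burden described above.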
- All changes to the data are accounted for
- Reactive – low to no resource usage for monitoring changes
- Hard to implement.
- Translation from physical to logical can be tricky
Dependencies between cache objects
One of the more challenging aspects of a cache system implementation is how you define and manage any dependencies between data items within the cache. This should be considered in the overall design of the system as to how data is stored, with special attention given to how granular or coarse the items you are putting into the cache are. You should shy away from an implementation where you are storing very discrete data items that cannot stand on their own as having business value. Data items that are only useful when aggregated together may not make the best candidates for being cached. After all, joining items together to return data with business value is the purview of the database and not necessarily that of the cache.
Cache Refresh Strategy
There are two approaches to consider when a piece of data held in cache becomes invalid or stale – actively refresh the data by fetching and replacing it, or remove the data from cache and let the next request fetch and cache the data.
In considering these approaches, one factor that might drive the decision of one over another is the overall expected volume of traffic. If the expected system volume is very high, the “remove the cache item and let the next requestor fetch the data” strategy could result in severe spikes in utilization at the data tier, as ‘N’ requestors all looking for the same piece of data removed from the cache go and request it from the database. This can be mitigated if the cache supports the concept of locking on a cache item key that isn’t in cache while the data is being fetched.
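That per-key locking mitigation might be sketched like this (Python for illustration; the class and names are hypothetical): only the first thread to miss runs the expensive fetch, while concurrent requestors wait on that key’s lock and then read the freshly cached value.

```python
import threading

class StampedeSafeCache:
    """On a miss, only the first caller for a key runs the expensive fetch;
    concurrent callers block on that key's lock, then see the cached value."""

    def __init__(self):
        self._cache = {}
        self._key_locks = {}
        self._guard = threading.Lock()

    def get(self, key, fetch):
        value = self._cache.get(key)
        if value is not None:
            return value                   # fast path: already cached
        with self._guard:
            lock = self._key_locks.setdefault(key, threading.Lock())
        with lock:
            value = self._cache.get(key)   # re-check: another thread may
            if value is None:              # have fetched while we waited
                value = fetch()
                self._cache[key] = value
        return value

fetches = []

def expensive_query():                     # stand-in for the database call
    fetches.append(1)
    return "report-data"

cache = StampedeSafeCache()
threads = [threading.Thread(target=cache.get,
                            args=("report", expensive_query))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the double-check inside the lock, every waiting thread would still run the query once the first one finished, which is precisely the spike this pattern is meant to prevent.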
Single Server Cache vs. Distributed Cache
Some thought should be given to whether a single cache server and the resources available to it would be sufficient for your implementation. If high availability of the cache is a requirement, you may need to consider a distributed cache implementation, which provides for storing multiple copies of your data within the cache for failover and high availability purposes.
I’d be interested in hearing any feedback on the ideas in this article, or learnings you may have from implementing a similar large scale caching implementation in support of a SOA based system.