Caching in on Scale and Performance – Part II

In Part I of this 3-Part article, we looked at the importance of caching and cost of not doing so. We then built a sample application with Redis Cache as an example.

Going back to our Cash-in-the-wallet example from the previous article, the entire transaction chain from the Bank to Wallet has many locations where money can be held in smaller quantities. The ATM has some part of the money. At the Bank Branch, the teller’s drawer has some cash stored while the bigger pile of cash is probably in the back of the bank inside a large vault. There might be an even bigger stash of cash at the bank HQ. Armored vehicles keep moving cash between locations.

This is very similar to the situation with Data. Cached data can be found across the application tiers. Some of them might be completely transparent to the Developer (SQL Cache, Browser Cached Pages etc.) while some Caching needs to be built grounds-up by the developers (App Tier caching, Page level caching using JavaScript and JSON etc.).

Cache

The above diagram depicts all the places Data can be cached. The red arrows indicate expensive network trips to fetch data that adds latency and reduce performance. The diagram is agnostic of Cloud or On-Prem solutions.

So, the question now is, what to cache and where? The key concept to note here is that SQL Server Database is your single source of Truth. Which means that while all updates to data must be written to the Database, every piece of data need not be fetched from the database.

Create a data heat map

The most important characteristics for Caching is the frequency of updates to data values. Some data like Countries, Cities, Zip Codes, Names of people, Date of birth etc. won’t change. Then there is some data can change but not too often. Customer address, Customer Preferences, Software Customization, Customized Screen Layouts are examples where there may be change, but not that frequently. And then there is real-time transactional data like Bank Balances, Instrument Values in Hospitals etc., that needs real-time read-writes to permanent storage and differences between permanent storage and cache can create big business issues.

What you need to do is to look at your entire Application data and split them into the following three categories:

  1. Data that never changes
  2. Data that could change every few months
  3. Data that changes Daily
  4. Real-time data, changes every second

Once you have these broad categories, you can then decide where to cache the Data. The first two categories, depending on the volume of data, can be cached in the web page as JSON objects managed by JavaScript or HTML5 Session Store. The third Category can stay closer to the Database in a Clustered and Load Balanced Cache system. The last one needs to be fetched from Database directly (But that trip can also be avoided by using Write-Through Cache mechanisms).

In the next (and the concluding part) we will discuss Cache usage patterns and Architecting the Cache sub-system for scale.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s