Buy your own cloud, off the shelf

The hybrid cloud as we have known it is a marriage of two very different architectures forced to work together. It was a workaround for organizations that wanted to move to the cloud but were unsure, or that had too much invested on-premises but were willing to take some systems to the cloud. Tools were forcibly modified to work with both: on-premises tools provided plugins for the cloud and vice versa, creating a mesh of toolsets that made it difficult for data-center admins to work with the so-called "hybrid" cloud. It was little more than a way to protect on-premises investments while still using the cloud for something, if not everything.

But hybrid is passé. The future is called (or at least should be called) the Seamless Cloud: the same architecture, the same tools, the same features and services for the private AND the public cloud.

Introducing Azure Stack: a cloud appliance that makes the public cloud private.

If there is one technology that has the power to make a wave like Windows did for personal computing, it's Azure Stack. Think about it: being able to install the cloud anywhere you want, then expand into the public cloud without having to change your design or architecture.

Azure Stack opens up many possibilities and makes developers' lives much easier. Most importantly, it brings PaaS on-premises, and that in my mind is a huge efficiency gain for developers. All those open-source tools that would otherwise take up your time to set up are now available as out-of-the-box services on a tried and tested cloud-scale architecture. Only a limited set is available today, but what's possible in the future looks exciting.

What's also really cool is that Azure Marketplace services are available on Azure Stack. So all those great services built by our partners and independent software vendors for the Azure Marketplace will run on Azure Stack.

In my opinion, Azure Stack should not even be called hybrid. Hybrid, by definition, is a combination of two very different (often opposing) entities, while Azure Stack is the public cloud in your data center.

A one-line definition: Azure Stack is Azure IaaS, plus part of Azure PaaS, installed in your data center.

It is available as an Appliance from Dell, HPE, Lenovo, and Huawei.

Avanade is about to join the list.

This Whitepaper on Azure Stack is a good place to start understanding the concepts.

In this article, I wanted to cover some real industry scenarios ideal for Azure Stack.

Global corporates, local laws

This, in my mind, is one of the most compelling applications of Azure Stack. Global companies build systems for internal and external consumption but are constrained by local country laws when it comes to putting data in the cloud. Currently, they have only two options:

  1. Build their systems using traditional frameworks, host them as IaaS on the public cloud where it's allowed, and run them on-premises in locations where laws are not cloud-friendly. The advantage is that they don't have to maintain two separate versions (mostly). But they lose all the advantages PaaS brings to the table.
  2. Build a cloud version using a combination of PaaS and IaaS as appropriate for the public cloud, and build a separate on-premises version for countries where laws are not cloud-friendly. The advantage: at least the cloud version uses best-of-breed services and all the cloud's benefits. The disadvantage: multiple versions must be maintained for cloud and on-premises deployments.

This is an ideal scenario for Azure Stack: keep the same architecture on-premises and in the cloud, reap all the benefits of the cloud, and deploy on-premises or in the cloud at will.

Value Added Services for heavy industry automation

Manufacturing automation companies have embedded control systems that provide monitoring and diagnostics for high-cost, critical equipment. Often they want to collect all this information from the entire shop floor, analyze the data, provide balcony-view dashboards, and so on. Latency matters, and none of the data really needs to go to the cloud. Currently their only choice is to build these applications on a host of different platforms and frameworks, install everything on a local server in the factory connected to the shop-floor sensor gateways, and then maintain upgrades and patches for those servers. Selling just the software means different setup environments and a maintenance and upgrade nightmare.

Azure Stack is once again an ideal solution here: all the required IoT and analytics frameworks bundled as services on a box, with deployments that can be container-based using AKS (Azure Kubernetes Service). The automation vendor can bundle an Azure Stack appliance as part of their automation system setup. All the heavy IoT ingestion, analytics, and so on can run on this appliance. The industry term for this is "purpose-built integrated systems".

Online-Offline systems

Online/offline applications have been around for a while to support environments where connectivity is not always guaranteed, but these scenarios were limited to a client-server setup: where connectivity is poor, the client application keeps working and collecting data, and a simple data-sync platform later syncs the data from the client to the server.

But applications have now evolved. Client systems have to do critical analytics and a lot of heavy lifting. In such situations too, Azure Stack is an ideal fit: a system that can be completely functional in disconnected mode and sync up when a connection is available.
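The store-and-forward behavior described above can be sketched in a few lines. This is an illustrative Python example (the article's own samples are C#, but the pattern is language-agnostic); `pending` and `server` are hypothetical in-memory stand-ins for the local buffer and the cloud-side store.

```python
# Store-and-forward sketch of an online/offline system: writes are
# collected locally while disconnected and synced when connectivity
# returns. Plain dicts/lists stand in for real storage.

pending = []        # data collected while offline
server = {}         # stand-in for the cloud-side store

def record(key, value, connected):
    """Record data locally; push straight through when online."""
    if connected:
        server[key] = value
    else:
        pending.append((key, value))

def sync():
    """Flush everything collected offline once connectivity returns."""
    while pending:
        key, value = pending.pop(0)
        server[key] = value

record("reading-1", 42, connected=False)   # offline: buffered locally
record("reading-2", 17, connected=False)
sync()                                     # connection restored
print(server)                              # both readings now on the server
```

The key design point is that the client remains fully functional while disconnected; the sync step only replays buffered writes.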

These are just a few scenarios for Azure Stack that we see in the industry. But it’s only limited by your imagination.

Caching in on Scale and Performance – Part III

In this concluding part of the article, I would like to discuss various cache-management methodologies. But before we delve into that, today's architects and software designers first need to make sure they are using the right database/storage technology, because a distributed, scalable NoSQL service like Cosmos DB, with its concept of request units (RUs), might not need caching at all. The scale problem is solved by partitioning, distributing, and scaling the database itself.

But if your design calls for a classic RDBMS style database, then a Caching layer and caching techniques need to be thought through.

In this section we will cover Cache Updates techniques and Caching Infrastructure considerations.

Fetch on miss

Most basic cache systems start out empty. When the application needs data, it first tries to read it from the cache. Since the cache is empty, a "not found" event is generated, which triggers a database fetch; the result is then written to the cache, so all subsequent reads are served from it.

The advantage of this method is that you don't have to populate the entire cache with data that may or may not be used; only the required data is loaded into the cache, so you save space and hence infrastructure cost. If you manage the cache TTL (time to live) properly, this method lets you run a very efficient cache on minimal infrastructure by keeping only the most frequently used data in it and nothing else.

The disadvantage of this method is that a cold run of the application is slow on response time, as the data has to be loaded first.

This method is ideal when only some parts of the data are used frequently, an occasional cache-miss cost is acceptable to users, and a small cache footprint matters more than cold-start speed.
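The fetch-on-miss flow above (often called cache-aside) fits in a few lines. This is a minimal Python sketch (the article's samples are C#, but the pattern is identical); a plain dict stands in for Redis, and `fetch_from_db` is a hypothetical placeholder for the expensive query.

```python
# Minimal cache-aside (fetch-on-miss) sketch. A dict stands in for
# Redis; fetch_from_db is a placeholder for a real database query.

cache = {}

def fetch_from_db(key):
    # Placeholder for an expensive database round trip.
    return f"row-for-{key}"

def get(key):
    """Return the value for key, loading it into the cache on a miss."""
    if key not in cache:                 # "not found" event
        cache[key] = fetch_from_db(key)  # triggers the database fetch
    return cache[key]                    # subsequent reads hit the cache

print(get("42"))  # first call misses and loads from the "database"
print(get("42"))  # second call is served from the cache
```

With Redis, the miss check would be a `GET` returning nil and the load a `SET` (typically with a TTL), but the control flow is the same.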

Pre-loaded Cache

Here you pre-load all the cache-able data at once, and then update the cache as and when data changes using one of the cache update methods (described later). This can be an anti-pattern: loading everything into the cache could undo the performance benefits because of the added cache management. While initially inefficient, the system should eventually evict entries that are not being used and converge to an optimal cache store.

Cache Eviction Policy

If you are using Redis, you can use a combination of the TTL (time to live) and EXPIRE commands to manage the cache optimally. A good cache eviction policy helps you manage the size and availability of your caching system.
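TTL-based eviction can be illustrated without a Redis server. This Python sketch mimics the effect of setting a value with an expiry (what Redis `SETEX`/`EXPIRE` provide) using an in-memory dict; the key names are made up for the example.

```python
import time

# TTL-based eviction sketch: each entry stores (value, expiry time),
# and an expired entry is evicted on the next read, reported as a miss.

cache = {}

def set_with_ttl(key, value, ttl_seconds):
    cache[key] = (value, time.monotonic() + ttl_seconds)

def get(key):
    entry = cache.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.monotonic() >= expires_at:   # expired: evict, report a miss
        del cache[key]
        return None
    return value

set_with_ttl("session", "abc", ttl_seconds=0.05)
print(get("session"))   # "abc" while the entry is fresh
time.sleep(0.1)
print(get("session"))   # None after the TTL elapses
```

Real Redis does this server-side (lazily on access plus a background sweep), so the application never sees stale entries.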

Cache Update Methods

These are the standard patterns for updating your cache. Each pattern has its merits and demerits.

Write Through: When data changes, it is written simultaneously to the cache and to the database. The advantage is consistency between database and cache. The disadvantage is that everything in the cache is kept updated whether it is needed or not.

Write Around: Data is written to the database first, to ensure it is persisted, and only fetched into the cache when accessed. The write logic can expire the cache entry when it writes to the database, so that the next read sees a cache miss and fetches fresh data.

Write Back: Data is written to the cache first and asynchronously updated to the database. The risk of data loss is high, so this method should be used only when some data loss is acceptable but data access needs to be very fast. It can also be used when the cache layer is replicated, so that the loss of one cache server will not prevent the database update.
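The write-through pattern above can be sketched concisely. This is an illustrative Python example with dicts standing in for the cache and the database (the article's samples are C#); in practice the cache would be Redis and the database an RDBMS, ideally updated in one transaction-like step.

```python
# Write-through sketch: every write updates the database and the cache
# together, so reads can trust the cache. Dicts stand in for both stores.

cache = {}
database = {}

def write_through(key, value):
    database[key] = value   # persist first (or in the same transaction)
    cache[key] = value      # keep the cache consistent with the database

def read(key):
    return cache.get(key)   # cache and database never disagree

write_through("balance:1001", 250)
print(read("balance:1001"))          # 250, straight from the cache
print(database["balance:1001"])      # 250, also persisted
```

Write-back would invert the order (cache first, database asynchronously), trading consistency for write latency, which is exactly the data-loss risk described above.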

Hopefully, this three-part article has covered the areas of caching that most people care about. One of my purposes in writing it was to make my own job easier: I no longer have to point customers in different directions when caching comes up.

Caching in on Scale and Performance – Part II

In Part I of this three-part article, we looked at the importance of caching and the cost of not doing so, and built a sample application with Redis Cache as an example.

Going back to our Cash-in-the-wallet example from the previous article, the entire transaction chain from the Bank to Wallet has many locations where money can be held in smaller quantities. The ATM has some part of the money. At the Bank Branch, the teller’s drawer has some cash stored while the bigger pile of cash is probably in the back of the bank inside a large vault. There might be an even bigger stash of cash at the bank HQ. Armored vehicles keep moving cash between locations.

This is very similar to the situation with data. Cached data can be found across the application tiers. Some of it is completely transparent to the developer (SQL cache, browser-cached pages, etc.), while some caching needs to be built from the ground up by developers (app-tier caching, page-level caching using JavaScript and JSON, etc.).

[Diagram: the places data can be cached across the application tiers]

The above diagram depicts all the places data can be cached. The red arrows indicate expensive network trips to fetch data, which add latency and reduce performance. The diagram is agnostic of cloud or on-premises solutions.

So, the question now is: what to cache, and where? The key concept is that the SQL Server database is your single source of truth, which means that while all updates must be written to the database, not every piece of data needs to be fetched from it.

Create a data heat map

The most important characteristic for caching is the frequency of updates to data values. Some data, like countries, cities, zip codes, people's names, and dates of birth, essentially never changes. Some data can change, but not often: customer addresses, customer preferences, software customizations, and customized screen layouts are examples. And then there is real-time transactional data, like bank balances or instrument values in hospitals, that needs real-time read-writes to permanent storage, where differences between permanent storage and cache can create big business issues.

What you need to do is look at your entire application data and split it into the following four categories:

  1. Data that never changes
  2. Data that could change every few months
  3. Data that changes Daily
  4. Real-time data, changes every second

Once you have these broad categories, you can decide where to cache the data. The first two categories, depending on the volume of data, can be cached in the web page as JSON objects managed by JavaScript, or in the HTML5 session store. The third category can stay closer to the database in a clustered, load-balanced cache system. The last one needs to be fetched from the database directly (though even that trip can be avoided by using a write-through cache).
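The heat-map-to-location mapping described above can be written down as a simple lookup. This is a hypothetical sketch: the category keys and location strings are my own labels for the four buckets and tiers discussed in the text, not an API.

```python
# Data "heat map" sketch: classify each data set by change frequency,
# then map the category to a caching location per the four buckets above.

HEAT_MAP = {
    "never":    "browser (JSON objects / HTML5 session store)",
    "monthly":  "browser (JSON objects / HTML5 session store)",
    "daily":    "clustered, load-balanced cache near the database",
    "realtime": "database directly (or via a write-through cache)",
}

def cache_location(change_frequency):
    """Return where data with the given change frequency should live."""
    return HEAT_MAP[change_frequency]

print(cache_location("never"))     # rarely-changing data stays client-side
print(cache_location("realtime"))  # hot transactional data stays server-side
```

Walking your schema table by table and assigning each one a bucket like this makes the caching design decisions explicit and reviewable.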

In the next (and the concluding part) we will discuss Cache usage patterns and Architecting the Cache sub-system for scale.

Caching in on Scale and Performance – Part I

The technologies covered: Azure SQL Database, Azure Redis Cache


Say it's lunchtime and you are standing in line for a sandwich. What's the first thing you do? Check your wallet for cash, right? If you don't have cash, you go to an ATM, but you most definitely don't go to a bank branch to withdraw cash for this. And at the ATM, do you withdraw just enough cash for the sandwich? No. You take out an amount you think you will need for a few days.

Going to your SQL Server database (or any RDBMS) to fetch data every time your client application needs it is like going to the bank every time you need cash. And while going to the bank to withdraw money might have been the modus operandi a few decades ago, it most definitely is not the right way to operate today. The same holds for applications: a decade ago, client-server systems operated this way, but with all the scalable caching options available today, it makes no sense to keep going to the database for every piece of data.

For over a decade, I have been helping customers scale their client-server and web applications. A decade ago, no one thought about caching, which was understandable: there weren't many caching/NoSQL technologies, and an RDBMS was the only database choice. But I observe that even today, with all the NoSQL and caching technologies available, many customers still have not given it a thought.

So, I decided to demystify the importance of caching in a three-part blog series. This one covers "Why cache?". The next will cover "What can you cache?". And in the third, I will cover caching design and architecture, with Redis Cache as an example.

Disclaimer

With a host of distributed NoSQL database technologies available, there may be other database options that get you similar benefits. This blog series therefore focuses on traditional client-server systems built on SQL Server-like RDBMS technologies, either on-premises or in the cloud.

The Build-up to a clog

This is a typical development pattern I have observed in how companies land in a situation where their systems start stalling and cannot scale.

Stage 1: All new application development starts with a developer writing code against a test database. Every time he hits F5, the application zips to the database and shows the data at lightning speed. And why wouldn't it? He and a few others are the only ones connected to the database.

Stage 2: This application is then deployed to production and starts getting used by a few hundred people. Performance drop is not particularly evident.

Stage 3: As the application's popularity grows and more and more people use it, there is a noticeable drop in performance. That is when the development team goes back to the whiteboard to find solutions, but at this point re-architecture comes at a huge cost.

Hence the case for building caching into your architecture and design right from the beginning!

A caching example

Let us walk through the process of building a simple web page that uses Redis Cache as opposed to a database. I have built two web-based applications on Azure: one connected to an Azure SQL Database and one connected to Azure Redis Cache. This article assumes a basic understanding of Azure services and .NET coding.

The database used is the AdventureWorks sample available on Azure when you set up an Azure SQL instance.

[Screenshot: creating the Azure SQL Database]

Then I set up a Redis Cache service on Azure and called it AWCache.

[Screenshot: creating the Azure Redis Cache]

Once the database and cache were set up, I built a console application that populates the cache from the database.

Here are the basic steps for constructing the cache:

Prerequisites: Newtonsoft.Json and the StackExchange.Redis library. Use NuGet to import the latest packages into Visual Studio.

Build a Join Query that gets ProductCategory X Product on Category ID:

select SalesLT.ProductCategory.ProductCategoryID,
SalesLT.ProductCategory.Name, 
SalesLT.Product.ProductID,SalesLT.Product.Name 
from SalesLT.Product, SalesLT.ProductCategory 
where SalesLT.ProductCategory.ProductCategoryID = SalesLT.Product.ProductCategoryID 
order by SalesLT.ProductCategory.Name

Build a Cache Structure to load the Join query into.

//Structure to store Category X Products
 public class CategoryXProduct
 {
    public string CategoryID;
    public string CategoryName;
    public List<Product> lstProducts;
    public CategoryXProduct()
    {
       lstProducts = new List<Product>();
    }
 }

 //Structure to store Products
 public class Product
 {
    public string ProductID;
    public string ProductName;

    public Product(string ID, string Name)
    {
        ProductID = ID;
        ProductName = Name;
    }
  }

Connect to the Cache and the SQL Database

//Connect to Cache Database with a ConnectionMultiplexer Helper Class
 IDatabase cache = Helper.Connection.GetDatabase();
 var server = Helper.Connection.GetServer("AWCache.redis.cache.windows.net:6380");
 
 //Clear the Cache
 server.FlushDatabase();

//Connect to AdventureWorks Data on Azure SQL (credentials redacted)
 dbConn = new SqlConnection("Server=tcp:mtcdemo.database.windows.net,1433;Initial Catalog=AdventureWorksDB;Persist Security Info=False;User ID=<user>;Password=<password>;MultipleActiveResultSets=False;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;");
 dbConn.Open();

//Run the query and load the results into the Cache, grouping rows by Category
SqlCommand cmd = new SqlCommand(query, dbConn); //query holds the join query shown above
SqlDataReader reader = cmd.ExecuteReader();

//Read the first row and start the first CategoryXProduct object
reader.Read();
CategoryXProduct cxp = new CategoryXProduct();

string categoryID = reader.GetValue(0).ToString();
string categoryName = reader.GetValue(1).ToString();
string productID = reader.GetValue(2).ToString();
string productName = reader.GetValue(3).ToString();

cxp.CategoryID = categoryID;
cxp.CategoryName = categoryName;
cxp.lstProducts.Add(new Product(productID, productName));

//Remember the current CategoryID to detect a change
string prevCategoryID = categoryID;

//Loop and watch for a CategoryID change
while (reader.Read())
{
    categoryID = reader.GetValue(0).ToString();
    categoryName = reader.GetValue(1).ToString();

    //If the Category is the same, just add the product
    if (categoryID == prevCategoryID)
    {
        productID = reader.GetValue(2).ToString();
        productName = reader.GetValue(3).ToString();
        cxp.lstProducts.Add(new Product(productID, productName));
    }
    //Otherwise write the finished object to the Cache and start a new one
    else
    {
        cache.StringSet(prevCategoryID, JsonConvert.SerializeObject(cxp));
        cxp = new CategoryXProduct();

        //Record the new Category and its first product
        productID = reader.GetValue(2).ToString();
        productName = reader.GetValue(3).ToString();
        cxp.CategoryID = categoryID;
        cxp.CategoryName = categoryName;
        cxp.lstProducts.Add(new Product(productID, productName));

        //Remember the new CategoryID
        prevCategoryID = categoryID;
    }
}

//Write the last CategoryXProduct object, which the loop never flushes
cache.StringSet(prevCategoryID, JsonConvert.SerializeObject(cxp));

Once the cache was built, I created two very similar-looking web applications. One connects to the database and one to the cache. The cache code looks something like this:

 string id = DLCategory.SelectedValue;
 CategoryXProduct item =
    JsonConvert.DeserializeObject<CategoryXProduct>(cache.StringGet(id));
 foreach (Product p in item.lstProducts)
 {
    LBProducts.Items.Add(p.ProductName);
 }

[Screenshot: the two sample applications side by side]

And voilà! We just built a page in an application that never has to go to the database to get data. But what if a new product is added or discontinued? Or a new product category gets added? That is what we will cover in the next part: what to cache, and the various caching methodologies.