Review of the Business Case
Before getting started, I just wanted to review the fictictious business case that we were going to solve that was covered in the previous post. We have an Employee Payroll System that we want to Scale Horizontally. We will do this by Sharding the Payroll database into 4 Shards with Employee ID's 1-25 in the first Shard, 26-50 in the second Shard, 51-75 in the third Shard and 76-100 in the final Shard. Using this approach, we can distribute the load across 4 databases instead of having all of the load handled by a single database. The layout of the Sharded databases can be seen in the diagram below.
Payroll System Database Setup
The next thing that we have to do is to set up the Powershell Libraries and the Azure SQL Database Runbook that we are going to use to create the Elastic Scale Databases to implement the Sharding. I'll go through these steps in case you want to set up the Runbook on your own Azure SQL Database. Where possible, I'm going to provide links for all of the steps because these steps have been covered in a lot better detail in other places.
- Create the Server - The easiest way to create an Azure Server for use with the Elastic Scale Databases is to create a sample database by going through the steps in this post. Remember the name of the Server, the Admin login and the Password that you created because we will need to add them to the Runbook later.
- Create an Azure Automation Account - The next thing that needs to happen is to create an Automation Account. In order to create an Automation Account, you first have to enable Automation in the Preview Portal. Once that is done, you just need to click on "Automation" on the left hand side of the Azure Portal, click the "Create" icon at the bottom of the screen and give a "Name" and "Region" for your Automation Account. Again, these steps (with pictures) are all summed up really well in the post Microsoft Azure Automation.
Upload PowerShell module to Azure Automation as an Asset - The next thing that we need to do is to upload a Powershell module that contains the Azure Elastic Scale dll (Microsoft.Azure.SqlDatabase.ElasticScale.Client.dll), some database helper functions that create and populate the databases (SqlDatabaseHelpers.psm1) and a Powershell data file that contains information about what is contained in the Powershell module (PayrollElasticScaleModule.psd1). In order to upload the Powershell module, please perform the following steps:
- Click on the Automation Account that you created in Step 2.
- Click on "Assets" at the top of the screen.
- Click "Import Module" at the bottom of the screen.
- Select PayrollElasticScaleModule.zip that is included with this post and then click the "Check Mark" to complete.
- Import the Azure Runbook - The last thing that we have to do is to import the Runbook. In order to import the Runbook, click "Runbooks" at the top of the screen, and then click "Import" at the bottom of the screen. Select PayrollElasticScale.ps1 that is included with this post and then click the "Check Mark" to complete.
If you've set up everything correctly, you should now see the PayrollElasticScale Runbook deployed to Azure as shown below.
Change Server Name, Username and Password
Create Shard Map Manager Database
Populate the Shard Map Manager Database
- [__ShardManagement].[ShardMapManagerGlobal] - Stores version information about the Elastic Scale Database
- [__ShardManagement].[ShardMapsGlobal] - Stores the ShardMapId and the name of the Shard Map that has been created. All of the rest of the information that we create in the Shard Map Manager Database will link back to this ShardMapId.
- [__ShardManagement].[ShardedDatabaseSchemaInfosGlobal] - Stores the schema information about the tables that are being Sharded.
- [__ShardManagement].[ShardMappingsGlobal] - Stores information about the Shards that get created such as the minimum and maximum values for each Shard.
- [__ShardManagement].[ShardsGlobal] - Stores information about which database contains each Shard.
- [__ShardManagement].[OperationsLogGlobal] - Stores information about operations that have been performed in the Shard Map Manager Database
We then add information about the tables that we are going to Shard by running the Powershell shown below. This step is quite important because it is here that we define the Shard Key, which in our case is the EmployeeID, and the table to which that key belongs - the Employee table. We also define the reference tables: the Payroll table and the PayrollDetail table.
Add Schema Information to the Shard Map Manager
At this point, only 3 tables in the Shard Map Manager database have data: [__ShardManagement].[ShardMapManagerGlobal], [__ShardManagement].[ShardMapsGlobal] and [__ShardManagement].[ShardedDatabaseSchemaInfosGlobal]. When we look at the [__ShardManagement].[ShardMapManagerGlobal] table, we can see the version information of the Elastic Scale technology that is being used.
When we look at the [__ShardManagement].[ShardMapsGlobal] table, we can see the ShardMapId and the Name that we have assigned to that Shard Map. We can also see that we have set up a Range Shard Map (ShardMapType = 2) and that our Shard Key is an integer (KeyType = 1).
Finally, when we look at the [__ShardManagement].[ShardedDatabaseSchemaInfosGlobal] table, we can see that for this particular Shard Map, an XML document has been defined that contains all of the information about the tables that we are going to Shard.
When we look at the XML document, we can see that it contains the Sharding Key (EmployeeID) and the table to which the Sharding Key belongs (Employee table).
XML Scema Information for the Shard Map Manager Database
I think that I'll probably wrap things up here for now as this post is starting to go on a bit. In the next post, I'll go into detail about the rest of the scripts which create the Shards and the C# application that can be used to run a query against all of the Shards.