Divide a data store into a set of horizontal partitions or shards. This can improve scalability when storing and accessing large volumes of data.
Once the hardware resources (server nodes) for deploying a distributed database are available, a distribution model should be chosen to leverage the cluster's capacity. Broadly, there are two paths to data distribution:
- Replication and
- Sharding.
Replication can be performed in two ways: master-slave and peer-to-peer. In a master-slave scheme, a single node, the master, processes all updates to the data, and a background process synchronises the data across the other nodes, the slaves. This kind of replication is recommended for read-intensive applications: to increase the cluster's capacity to process read requests, more slaves can be added. However, write throughput is limited by the master node's capacity to process write requests. In peer-to-peer replication there is no master node; all replicas can accept write requests, improving the system's capacity to handle writes. On the other hand, peer-to-peer clusters have to deal with the consistency problems that may arise when the same data is updated on different replicas.
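The routing consequence of the master-slave scheme can be sketched in a few lines. This is a minimal illustration, not a real driver: the node names and the `MasterSlaveRouter` class are hypothetical, and a production system would also handle failover and replication lag.

```python
import itertools

class MasterSlaveRouter:
    """Sketch of master-slave request routing (node names hypothetical).

    All writes go to the single master; reads rotate across the slaves.
    Read throughput grows by adding slaves, while write throughput stays
    bounded by the one master.
    """
    def __init__(self, master, slaves):
        self.master = master
        self._reads = itertools.cycle(slaves)  # round-robin over slaves

    def node_for(self, operation):
        return self.master if operation == "write" else next(self._reads)

router = MasterSlaveRouter("master", ["slave1", "slave2"])
```

Adding a third slave to the list would raise read capacity without touching the write path, which is exactly the scaling property (and limitation) described above.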
Sharding is a way of scaling out a database via horizontal fragmentation. A horizontal fragment of a relation (table) is a subset of the tuples in that relation. The tuples that belong to the horizontal fragment are specified by a condition on one or more attributes of the relation; often, only a single attribute is involved. Horizontal scalability is thus supported by putting different parts of the data onto different servers of a cluster. The objective is to make different clients talk to different server nodes, so that the load is balanced nicely among the servers. Sharding is particularly valuable for performance because it can improve both read and write performance. Replication can greatly improve read performance but does little for applications with many write operations; sharding provides a way to horizontally scale writes as well.
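A horizontal fragment, as defined above, is just the subset of rows satisfying a condition on an attribute. The sketch below illustrates this with a hypothetical `customers` table and a `region` attribute (both invented for the example):

```python
# Hypothetical table: each dict is a tuple (row) of the relation.
customers = [
    {"id": 1, "region": "EU"},
    {"id": 2, "region": "US"},
    {"id": 3, "region": "EU"},
]

def horizontal_fragment(rows, predicate):
    """Return the subset of rows satisfying the fragmentation condition."""
    return [row for row in rows if predicate(row)]

# Two horizontal fragments, defined by a condition on a single attribute;
# each fragment could then live on a different server node.
eu_fragment = horizontal_fragment(customers, lambda r: r["region"] == "EU")
us_fragment = horizontal_fragment(customers, lambda r: r["region"] == "US")
```

Together the fragments cover the whole relation, and each can be placed on its own node.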
Partitioning Strategies in Databases
There are four different strategies for partitioning data across a cluster:
- Round-robin,
- Range,
- List, and
- Hash partitioning.
The simplest partitioning strategy is round-robin, which distributes the rows of a table among the nodes in a round-robin fashion (Figure 1a).
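Round-robin assignment can be sketched directly: row *i* goes to node *i* mod *n*. A minimal illustration, with the function name invented for the example:

```python
def round_robin_partition(rows, num_nodes):
    """Assign row i to node i mod num_nodes (round-robin fashion)."""
    partitions = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        partitions[i % num_nodes].append(row)
    return partitions

# Seven rows over three nodes: sizes differ by at most one.
parts = round_robin_partition(list(range(7)), 3)
```

Note that no attribute of the row is inspected; this is what makes round-robin the simplest strategy, and also why it cannot support attribute-based routing of queries.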
For the range, list, and hash strategies, an attribute, known as partitioning key, must be chosen among the table attributes. The partition of the table rows will be based on the value of the partitioning key. In the range strategy, a given range of values is assigned to a partition. The data is distributed among the nodes in such a way that each partition contains rows for which the partitioning key value lies within its range (Figure 1b).
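Under stated assumptions (partition boundaries are the sorted upper bounds of each range, and a key equal to a boundary falls into the next partition), range lookup is a binary search over the boundaries:

```python
import bisect

def range_partition(key, boundaries):
    """Return the index of the partition whose range contains key.

    boundaries are the sorted upper bounds of all partitions except the
    last: partition 0 holds keys < boundaries[0], partition i holds keys
    boundaries[i-1] <= key < boundaries[i], and the last partition holds
    the rest.
    """
    return bisect.bisect_right(boundaries, key)
```

For example, with boundaries `[100, 200]` there are three partitions: keys below 100, keys from 100 up to (but not including) 200, and keys of 200 and above.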
The list strategy is similar to the range strategy, except that each partition is assigned an explicit list of values rather than a range. A partition is selected to hold a row if the partitioning key value equals one of the values in its list (Figure 1c). In the hash strategy, the mapping between partitioning key values and nodes is based on the result of a hash function: the partitioning key value is used as the parameter of the hash function, and the result determines where the data will be placed (Figure 1d).
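The two remaining strategies can be sketched side by side. The function names and the use of MD5 here are illustrative choices, not a prescribed implementation; any deterministic hash would do:

```python
import hashlib

def list_partition(key, value_lists):
    """value_lists[i] is the explicit list of key values assigned to
    partition i; a row goes to the partition whose list contains its key."""
    for i, values in enumerate(value_lists):
        if key in values:
            return i
    raise KeyError(f"no partition lists value {key!r}")

def hash_partition(key, num_nodes):
    """Place the row on the node determined by hashing the partitioning
    key (deterministic, unlike Python's built-in hash across runs)."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % num_nodes
```

The list strategy requires every key value to appear in some list, while the hash strategy maps any key to a node automatically, at the cost of giving up control over which rows are co-located.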
Sharding Strategies in Azure
Three strategies are commonly used when selecting the shard key and deciding how to distribute data across shards. Note that there doesn’t have to be a one-to-one correspondence between shards and the servers that host them—a single server can host multiple shards. The strategies are:
The Lookup strategy. In this strategy the sharding logic implements a map that routes a request for data to the shard that contains that data using the shard key. In a multi-tenant application all the data for a tenant might be stored together in a shard using the tenant ID as the shard key. Multiple tenants might share the same shard, but the data for a single tenant won’t be spread across multiple shards. The figure illustrates sharding tenant data based on tenant IDs.
The mapping between the shard key and the physical storage can be based on physical shards, where each shard key maps to a physical partition. Alternatively, a more flexible technique for rebalancing shards is virtual partitioning, where shard keys map to virtual shards, which in turn map to fewer physical partitions. In this approach, an application locates data using a shard key that refers to a virtual shard, and the system transparently maps virtual shards to physical partitions. The mapping between a virtual shard and a physical partition can change without requiring the application code to be modified to use a different set of shard keys.
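The indirection described above can be made concrete in a few lines. All names and the modulo-based mappings below are hypothetical, chosen only to show the two-level lookup and a rebalancing step:

```python
NUM_VIRTUAL_SHARDS = 8

# Virtual shard -> physical partition map. This table is what the system
# maintains; the application never sees it directly.
virtual_to_physical = {v: v % 2 for v in range(NUM_VIRTUAL_SHARDS)}  # 2 partitions

def virtual_shard(tenant_id: int) -> int:
    """The application-facing mapping: shard key -> virtual shard."""
    return tenant_id % NUM_VIRTUAL_SHARDS

def physical_partition(tenant_id: int) -> int:
    """The system transparently resolves the virtual shard to storage."""
    return virtual_to_physical[virtual_shard(tenant_id)]

# Rebalancing: move virtual shard 3 onto a newly added physical
# partition 2. Application code and shard keys are unchanged.
virtual_to_physical[3] = 2
```

After the reassignment, every tenant whose key maps to virtual shard 3 is served from the new partition, with no change to the keys the application uses.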
The Range strategy. This strategy groups related items together in the same shard, and orders them by shard key—the shard keys are sequential. It’s useful for applications that frequently retrieve sets of items using range queries (queries that return a set of data items for a shard key that falls within a given range). For example, if an application regularly needs to find all orders placed in a given month, this data can be retrieved more quickly if all orders for a month are stored in date and time order in the same shard. If each order were stored in a different shard, they’d have to be fetched individually by performing a large number of point queries (queries that return a single data item). The next figure illustrates storing sequential sets (ranges) of data in shards.
In this example, the shard key is a composite key containing the order month as the most significant element, followed by the order day and the time. The data for orders is naturally sorted when new orders are created and added to a shard. Some data stores support two-part shard keys containing a partition key element that identifies the shard and a row key that uniquely identifies an item in the shard. Data is usually held in row key order in the shard. Items that are subject to range queries and need to be grouped together can use a shard key that has the same value for the partition key but a unique value for the row key.
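A composite shard key of this shape can be sketched as a (partition key, row key) pair. The format strings below are an illustrative assumption about how month and day/time might be encoded, not a prescribed layout:

```python
from datetime import datetime

def shard_key(order_time: datetime):
    """Composite shard key: the order month (most significant element)
    selects the shard; day and time order rows within it."""
    partition_key = order_time.strftime("%Y-%m")   # identifies the shard
    row_key = order_time.strftime("%d %H:%M:%S")   # unique within the shard
    return partition_key, row_key
```

Two orders placed in the same month share a partition key and therefore land in the same shard, where their row keys keep them in date and time order, which is what makes the month-range query a single-shard scan.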
The Hash strategy. The purpose of this strategy is to reduce the chance of hotspots (shards that receive a disproportionate amount of load). It distributes the data across the shards in a way that achieves a balance between the size of each shard and the average load that each shard will encounter. The sharding logic computes the shard to store an item in based on a hash of one or more attributes of the data. The chosen hashing function should distribute data evenly across the shards, possibly by introducing some random element into the computation. The next figure illustrates sharding tenant data based on a hash of tenant IDs.
To understand the advantage of the Hash strategy over other sharding strategies, consider how a multi-tenant application that enrolls new tenants sequentially might assign the tenants to shards in the data store. When using the Range strategy, the data for tenants 1 to n will all be stored in shard A, the data for tenants n+1 to m will all be stored in shard B, and so on. If the most recently registered tenants are also the most active, most data activity will occur in a small number of shards, which could cause hotspots. In contrast, the Hash strategy allocates tenants to shards based on a hash of their tenant ID. This means that sequential tenants are most likely to be allocated to different shards, which will distribute the load across them. The previous figure shows this for tenants 55 and 56.
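The hotspot-avoidance argument can be illustrated with a small sketch. The function name, the choice of SHA-256, and the letter-labelled shards are all assumptions made for the example:

```python
import hashlib

def shard_for_tenant(tenant_id: int, num_shards: int) -> str:
    """Hash the tenant ID so that sequentially enrolled tenants are
    likely to scatter across shards instead of filling one shard."""
    digest = hashlib.sha256(str(tenant_id).encode()).hexdigest()
    return chr(ord("A") + int(digest, 16) % num_shards)

# Under the Range strategy, tenants 55 and 56 would sit side by side in
# the same shard; hashing makes their placement independent.
assignments = {t: shard_for_tenant(t, 4) for t in range(1, 21)}
```

The assignment is deterministic for a given tenant ID, so routing needs no lookup map, but adjacent IDs no longer cluster, which is the point of the strategy.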
The three sharding strategies have the following advantages and considerations:
- Lookup. This offers more control over the way that shards are configured and used. Using virtual shards reduces the impact when rebalancing data because new physical partitions can be added to even out the workload. The mapping between a virtual shard and the physical partitions that implement the shard can be modified without affecting application code that uses a shard key to store and retrieve data. Looking up shard locations can impose an additional overhead.
- Range. This is easy to implement and works well with range queries because they can often fetch multiple data items from a single shard in a single operation. This strategy offers easier data management. For example, if users in the same region are in the same shard, updates can be scheduled in each time zone based on the local load and demand pattern. However, this strategy doesn’t provide optimal balancing between shards. Rebalancing shards is difficult and might not resolve the problem of uneven load if the majority of activity is for adjacent shard keys.
- Hash. This strategy offers a better chance of more even data and load distribution. Request routing can be accomplished directly by using the hash function. There’s no need to maintain a map. Note that computing the hash might impose an additional overhead. Also, rebalancing shards is difficult.
Related patterns and guidance
The following patterns and guidance might also be relevant when implementing this pattern:
- Data Consistency Primer. It might be necessary to maintain consistency for data distributed across different shards. Summarizes the issues surrounding maintaining consistency over distributed data, and describes the benefits and tradeoffs of different consistency models.
- Data Partitioning Guidance. Sharding a data store can introduce a range of additional issues. Describes these issues in relation to partitioning data stores in the cloud to improve scalability, reduce contention, and optimize performance.
- Index Table pattern. Sometimes it isn’t possible to completely support queries just through the design of the shard key. Enables an application to quickly retrieve data from a large data store by specifying a key other than the shard key.
- Materialized View pattern. To maintain the performance of some query operations, it’s useful to create materialized views that aggregate and summarize data, especially if this summary data is based on information that’s distributed across shards. Describes how to generate and populate these views.