«Algorithms for Data Migration E. Anderson · J. Hall · J. Hartline · M. Hobbes · A. Karlin · J. Saia · R. Swaminathan · J. Wilkes Received: 15 ...»
Algorithms for Data Migration
E. Anderson · J. Hall · J. Hartline · M. Hobbes ·
A. Karlin · J. Saia · R. Swaminathan · J. Wilkes
Received: 15 August 2007 / Accepted: 16 July 2008
© Springer Science+Business Media, LLC 2008
The data migration problem is the problem of computing a plan for moving data objects stored on devices in a network from one conﬁguration to another.
Load balancing or changing usage patterns might necessitate such a rearrangement E. Anderson · R. Swaminathan · J. Wilkes Hewlett–Packard Laboratories, Palo Alto, CA 94304, USA E. Anderson e-mail: email@example.com R. Swaminathan e-mail: firstname.lastname@example.org J. Wilkes e-mail: email@example.com J. Hall · A. Karlin Department of Computer Science, University of Washington, Seattle, WA 98195, USA J. Hall e-mail: firstname.lastname@example.org A. Karlin e-mail: email@example.com J. Hartline Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL 60208, USA e-mail: firstname.lastname@example.org M. Hobbes School of Engineering and Information Technology, Deakin University, Geelong, VIC 3217, Australia e-mail: email@example.com J. Saia ( ) Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA e-mail: firstname.lastname@example.org Algorithmica of data. In this paper, we consider the case where the objects are ﬁxed-size and the network is complete. Our results are both theoretical and empirical. Our main theoretical results are (1) a polynomial time algorithm for ﬁnding a near-optimal migration plan in the presence of space constraints when a certain number of additional nodes is available as temporary storage, and (2) a 3/2-approximation algorithm for the case where data must be migrated directly to its destination. We also run extensive experiments on several algorithms for various data migration problems and show that empirically, many algorithms perform better in practice than their theoretical bounds suggest. We conclude that many of the algorithms we present are both practical and effective for data migration.
Keywords Data migration · Edge coloring · Load-balancing · Space constraints · Approximation algorithms 1 Introduction The performance of modern day large-scale storage systems (such as disk farms) depends critically on having an assignment of data to storage devices that balances the load across the devices or that optimizes a more complex cost function. Unfortunately, the optimal data layout is likely to change over time because of workload changes, device additions, or device failures. Thus, it is common to periodically compute a new assignment of data to devices [3, 5–7, 22], either at regular intervals or on demand as system changes occur. Once the new assignment is computed, the data must be migrated from the old conﬁguration to the new conﬁguration. This migration must be done as efﬁciently as possible to minimize the impact of the migration on the system. The large size of the data objects (gigabytes are common) and the large amount of total data (can be petabytes) makes migration a process which can easily take several days.
In this paper, we consider the problem of ﬁnding an efﬁcient migration plan. We focus solely on the off-line migration problem, i.e., we ignore the load imposed by user requests for data objects during the migration. Our motivation for studying this problem lies in migrating data for large-scale storage system management tools such as Hippodrome . Hippodrome automatically adapts to changing demands on a storage system without human intervention. It analyzes a running workload of requests to data objects, calculates a new load-balancing conﬁguration of the objects and then migrates the objects. An ofﬂine migration can be performed as a background process or at a time when loads from user requests are low (e.g. over the weekend).
The input to our migration problem is an initial and ﬁnal conﬁguration of data objects on devices, and a description of the storage system such as the storage capacity of each device, and the underlying network connecting the devices. Our goal is to ﬁnd a migration plan that uses the existing network connections between storage devices to move the data from the initial conﬁguration to the ﬁnal conﬁguration in the minimum amount of time. For obvious reasons, we require that all intermediate conﬁgurations in the plan be valid: they must obey the space constraints of the storage devices as well as usability requirements of the on-line system. (The migration Algorithmica process can be stopped at any time and the on-line system should still be able to run and maintain full functionality.) The time it takes to perform a migration is a function of the sizes of the objects being transferred, the network link speeds and the degree of parallelism in the plan.
A crucial constraint on the legal parallelism in any plan is that each storage device can be involved in the transfer (either sending or receiving, but not both) of only one object at a time.
Most variants one can imagine on this problem are NP-complete. The migration problem for networks of arbitrary topology is NP-complete even if all objects are the same size and each device has only one object that must be moved off of it. The problem is also NP-complete when there are just two storage devices connected by a link, if the objects are of arbitrary sizes.1 We will always make the following assumptions.
1. The data objects are all the same size;
2. There is at least one free space on each storage device in both the initial and ﬁnal conﬁgurations of the data;
3. There is a direct bidirectional link between each pair of devices and each link has the same bandwidth; and
4. Every device has the same read and write speed.
The ﬁrst assumption is quite reasonable in practice if we allow ourselves to subdivide the existing variable sized objects into unit sized pieces, since the time it takes to send the subdivided object is about the same as the time it takes to send the entire object. The second assumption is also reasonable in practice, since free space somewhere is required in order to move any objects, and having only one free space in the entire network would limit the solution to being sequential.
The third and fourth assumptions are reasonable if we assume that the storage devices are connected in a local area network (SAN), for example in a disk farm. The third assumption is reasonable since the topologies in SANs are generally close to being complete. The fourth assumption is reasonable since an organization building a SAN of storage devices would typically buy devices which have similar performance characteristics.
With these assumptions, we are led to describe the input to our problem as a directed multigraph G = (V, E) without self-loops that we call the demand graph.
Each node in the demand graph corresponds to a storage device, and each directed edge (u, v) ∈ E represents an object that must be moved from storage device u (in the initial conﬁguration) to storage device v (in the ﬁnal conﬁguration). An example of how a demand graph is deﬁned based on an initial and goal conﬁguration is given in Fig. 1.
Since we are considering ﬁxed-size objects, our migration plan can be divided into stages where each stage consists of a number of compatible sends, i.e., each stage is a matching. Thus, we can observe that the special case of our problem when there are no space constraints on the storage devices, and sends must be direct, is precisely the multigraph edge coloring problem (the directionality of the edges becomes irrelevant
Fig. 1 An example demand graph. v1, v2, v3, v4 are devices in the network and the edges in the ﬁrst two graphs represent links between devices. a, b, c, d are the data objects which must be moved from the initial conﬁguration to the goal conﬁguration and the colors of the edges represent the time steps when objects are sent). This problem is of course NP-complete, but there are good approximation algorithms for it, as we shall review in Sect. 1.4. The storage migration application introduces two interesting twists on the traditional edge coloring problem: edge coloring with bypass nodes and edge coloring with space constraints. We review these variants brieﬂy in the next section.
1.1 Data Migration Variants 1.1.1 Bypass Nodes In the ﬁrst variant on edge coloring, we consider the question of whether indirect plans can help us to reduce the time it takes to perform a migration. In an indirect plan, an object might temporarily be sent to a storage device other than its ﬁnal destination. It is easy to see that this might reduce the number of stages in the migration plan. As a ﬁrst step towards attacking the problem of constructing near-optimal indirect plans, we introduce the concept of a bypass node. A bypass node is an extra storage device that can be used to store objects temporarily in an indirect plan. (In practice, some of the storage devices in the system may either not be involved or will be only minimally involved in the migration of objects and these devices can be used as bypass nodes.) A natural question to then ask is: what is the trade-off between the number of bypass nodes available and the time it takes to perform the migration? In particular, how many bypass nodes are needed in order to perform the migration in (G) steps, where (G) (or where G is understood) is the maximum total degree of any node in the demand graph G. ( is a trivial lower bound on the number of steps needed, no matter how many bypass nodes are available.) An optimal direct migration takes at least χ parallel steps where χ is the chromatic index of the demand graph (the minimum number of colors in any edge coloring). If our solution is not required to send objects directly from source to destination it is possible that there is a migration plan that takes less than χ stages. In general, our goal will be to use a small number of bypass nodes, extra storage devices in the network available for temporarily storing objects, to perform the migration in essentially stages.
Algorithmica Fig. 2 Example of how to use a bypass node. In the graph on the left, each edge is duplicated k times and clearly χ = 3k. However, using only one bypass node, we can perform the migration in = 2k stages as shown on the right. (The bypass node is shown as “◦”) Deﬁnition 1 A directed edge (v, w) in a demand graph is bypassed if it is replaced by two edges, one from v to a bypass node, and one from that bypass node to w.
An important constraint that bypassing an edge imposes is that the object must be sent to the bypass node before it can be sent from the bypass node. In this sense, edges to and from bypass nodes are special. Figure 2 gives an example of how a bypass node can be used.
By replicating the graph on the left side of Fig. 2 n/3 times, we see that there exist graphs which require n/3 bypass nodes in order to complete a migration in steps.
1.1.2 Space Constraints
When space constraints are introduced, we obtain our second, more complex, variant on the edge coloring problem. We consider here the limiting case where there is the minimum possible free space at each node, including bypass nodes, such that there is at least one free space in both the initial and ﬁnal conﬁgurations. We can deﬁne this
problem more abstractly as the edge coloring with space constraints problem:
The input to the problem is a directed multigraph G, where there are initially F (v) free spaces on node v. By the free space assumption, F (v) = max(din (v) − dout (v), 0) + 1, where din (v) is the in-degree and dout (v) is the out-degree of node v.
The problem is to assign a positive integer (a color) to each edge so that the maximum integer assigned to any edge is minimized (i.e., the number of colors used is minimized) subject to the constraints that
• no two edges incident on the same node have the same color, and (i) (i) (i) (i)
• for each i and each node v, cin (v) − cout (v) ≤ F (v), where cin (v) (resp. cout (v)) is the number of in-edges (resp. out-edges) incident to v with color at most i.
The second condition, which we refer to as the space constraint condition, captures the requirement that at all times the space consumed by data items moved onto a storage device minus the space consumed by data items moved off of that storage device cannot exceed the initial free space on that device.