Data is almost always on the move. Storage systems have typically been refreshed about every 3 years for as long as we have been creating data, and now, with cloud storage offerings, data is on the move again. Will data finally find its perpetual resting place in the cloud? Not likely. There are always going to be drivers that push data to different locations. These drivers, mostly cost and performance related, evolve over time as needs change and market offerings improve.

So if data isn't resting, what is it doing? Technically, data is always in one of three states:

  • Data at rest. Data that is housed physically on computer data storage in any digital form, including both structured and unstructured data.
  • Data in motion (data in transit, data in flight). Digital information that is in the process of being transported between locations, either within or between computer systems.
  • Data in use. Active data that is stored in a nonpersistent digital state, typically in RAM, CPU caches, or CPU registers.

While data is "resting" for periods of time, it sits in persistent storage (storage that retains data even when it is powered off), and this storage costs money to maintain. It also has a significant impact on the environment. As NetApp Chief Technology Evangelist Matt Watts wrote, "Currently, two thirds of the world's data isn't used. We call this 'digital waste.' And what we need to understand about data is that it has mass, it has to physically exist somewhere. This means that all data has a carbon footprint."

So the drivers that set data in motion are not just cost and aging storage infrastructure. There is also the desire to move, or "tier," data to more eco-friendly environments, such as shared services infrastructure like the cloud.

For this and other reasons (primarily cloud adoption), a lot of data is being set in motion. This activity is called data migration: the process of selecting, preparing, extracting, and transforming data and (typically) permanently transferring it from one computer storage system to another.

Although moving data may seem like an easy task, there are so many varieties of data storage systems and file systems that even a simple move may require more than just a Move or Copy command. Also, a lot of this data is mission critical to current business operations and may require a nondisruptive data migration. Finally, there's the sheer volume of the data. We have progressed from megabytes in the 1980s to petabytes and beyond today, so most data migration challenges come down to the time a migration takes to complete.
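To put the time problem in rough numbers, here is a back-of-the-envelope sketch. The function name, the 10Gbps link, and the 80% efficiency factor are illustrative assumptions rather than figures from this article, and real migrations are also limited by file counts, metadata operations, and source system load.

```python
def estimate_transfer_days(size_bytes: float,
                           link_bits_per_second: float,
                           efficiency: float = 0.8) -> float:
    """Estimate how many days a bulk copy takes over a network link.

    efficiency is an assumed fraction of the raw link speed that is
    actually usable for data (protocol overhead, contention, and so on).
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    effective_bits_per_second = link_bits_per_second * efficiency
    seconds = size_bytes * 8 / effective_bits_per_second
    return seconds / 86_400  # seconds in a day


PETABYTE = 10**15         # decimal petabyte, in bytes
TEN_GBPS = 10 * 10**9     # hypothetical 10Gbps link, in bits per second

days = estimate_transfer_days(PETABYTE, TEN_GBPS)
print(f"{days:.1f} days")  # about 11.6 days under these assumptions
```

Even under these optimistic assumptions, a single petabyte takes well over a week to move, which is why a migration usually cannot be treated as a one-time copy during a short maintenance window.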

We know that we need to migrate data, that the source and destination will always be arbitrary, that the data volumes will vary, and that we want to do it nondisruptively. To meet all of these requirements from a technology perspective, we typically use data replication technology. Data replication can usually be extended across a computer network, and it creates an exact replica at the destination. This technology has traditionally been used for off-site disaster recovery, but it is increasingly used for data migration because it can be done nondisruptively. Data replication technology also ensures that all data, including any changes made while the migration is underway, is replicated to the destination.
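The general pattern behind replication-based migration can be sketched as a baseline copy followed by incremental passes that pick up changes made in the meantime. The example below is a toy model under stated assumptions: plain Python dictionaries stand in for the source and destination systems, and the `replicate` function is hypothetical. It is not how any particular replication product is implemented.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Content fingerprint used to detect changed files."""
    return hashlib.sha256(data).hexdigest()


def replicate(source: dict, destination: dict) -> int:
    """Run one replication pass and return the number of changes applied.

    Copies new or modified entries to the destination and removes entries
    that no longer exist at the source, so the destination ends up as an
    exact replica of the source at the time of the pass.
    """
    changes = 0
    for path, data in source.items():
        if path not in destination or checksum(destination[path]) != checksum(data):
            destination[path] = data
            changes += 1
    for path in list(destination):
        if path not in source:
            del destination[path]
            changes += 1
    return changes


# Hypothetical source and destination "systems".
source = {"a.txt": b"alpha", "b.txt": b"bravo"}
destination = {}

# 1. Baseline pass: copy everything while the source stays in use.
baseline_changes = replicate(source, destination)

# 2. The source keeps changing during the migration.
source["b.txt"] = b"bravo v2"
source["c.txt"] = b"charlie"
del source["a.txt"]

# 3. Incremental pass: only the differences are transferred.
incremental_changes = replicate(source, destination)

# 4. Final pass before cutover: nothing left to transfer.
final_changes = replicate(source, destination)
```

Because each incremental pass moves only what changed since the previous one, the final pass before cutover is short, which is what keeps the disruption to users small.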

Disclaimer

NetApp Inc. published this content on 09 December 2021 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 10 December 2021 09:41:01 UTC.