Most organizations are keeping an ever-increasing amount of data, yet only a small amount of that data is live or ‘hot’. Some of the rest is occasionally accessed, but the vast bulk is ‘cold’ – data that’s rarely, if ever, accessed but is kept in case it is required in the future. It might be needed as evidence if a dispute arises, say, or for regulatory compliance, or because external market changes mean we need to analyze something we could formerly ignore.
Keeping that cold data on the same storage as the hot data wastes capacity and cost – and time too, if it means you repeatedly back up cold data that doesn’t need it. Yet for many organizations, sorting the hot from the cold is like trying to unmix a warm bath. Others have tackled the problem, resorting to a range of approaches to move cold data to cheaper locations.
In this paper we look at the challenges that cold data presents, at techniques and technologies that can help with the problem, and at the advantages organizations can gain from a smarter approach to data management.
The cold data challenge
It’s been claimed that data is the new oil, and that it may be the most valuable thing that an organization owns. Partly as a result of these claims, partly because almost everything now generates a data trail, and partly just because they can, organizations are accumulating more and more data. Worse, the data growth rate for many organizations is exponential rather than linear.
Cutting the storage knot
Three steps, or capabilities, are necessary before you can reduce storage costs by solving the cold data challenge:
- Finding out what data you have,
- Gaining visibility into it,
- Taking action on what you learn.
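As a minimal sketch of the first two steps, the snippet below walks a directory tree and reports files whose last-access time is older than a threshold. The one-year cutoff and the function name are illustrative assumptions, not part of any product discussed here; real tools also weigh modification time, ownership, and file type.

```python
import os
import time
from pathlib import Path

COLD_AGE_DAYS = 365  # assumption: files untouched for a year count as cold

def find_cold_files(root, cold_age_days=COLD_AGE_DAYS):
    """Walk a directory tree and yield (path, size_in_bytes) for files
    whose last-access time is older than the cold threshold."""
    cutoff = time.time() - cold_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                continue  # skip files that vanish or are unreadable mid-scan
            if st.st_atime < cutoff:
                yield path, st.st_size
```

Summing the yielded sizes gives a first estimate of how much capacity cold data is consuming, which is the visibility needed before taking action. Note that some filesystems mount with access-time updates disabled, which would skew results like these.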
It is important to remember, though, that ‘the cloud’ is not the answer to every data challenge. There’s a saying that if all you have is a hammer, then everything looks like a nail, but as an IT or storage professional you have a complete toolkit. Business managers’ fascination with public clouds might well be your catalyst for change and the availability of funding, but depending on the use case, modern on-site storage or private cloud may be the more practical or cost-effective option.
Cold data management: a real-world example
As our example we’ll use Komprise Intelligent Data Management, from the eponymous sponsor of this paper. While nothing we say here should be interpreted as endorsing or recommending this solution, working through a specific offering enables us to move beyond the theory and illustrate how some of the key principles we have been discussing can be translated into operational reality.
The business needs to take control of spiraling data volumes and storage costs, but of course it must do so with minimal change and disruption. A solution such as Komprise Intelligent Data Management can meet these needs by combining data analysis tools with policy-driven data movement and replication capabilities, all deployed as a grid of virtual appliances.
A key feature that can help with this storage evolution is what Komprise calls its Transparent Move Technology™ (TMT), which uses industry-standard symbolic links to transparently redirect file access to the secondary store. These links are dynamic and resilient – they can be rebuilt if need be, and they allow the data to be moved multiple times without changing the symlink on the primary store. This allows Komprise to be storage-agnostic and to sit outside the primary (hot) data path, with no need for software agents and the like. Moved files remain accessible on the secondary store even if Komprise is offline.
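To show the general symlink mechanism that approaches like TMT build on, here is a minimal sketch – emphatically not Komprise’s implementation. A file is moved to secondary storage and a symbolic link is left at its original path, so applications reading the original path are redirected without knowing the data has moved. The function name and single-directory layout are assumptions for illustration.

```python
import shutil
from pathlib import Path

def transparent_move(primary_path, secondary_dir):
    """Relocate a file to secondary storage, leaving a symbolic link at
    the original path so reads are transparently redirected."""
    primary = Path(primary_path)
    secondary = Path(secondary_dir) / primary.name
    shutil.move(str(primary), str(secondary))  # relocate the cold file
    primary.symlink_to(secondary)              # original path now redirects
    return secondary
```

After the move, opening the original path follows the link to the secondary copy, which is why the data stays reachable even when the data-management layer itself is offline. A production system would also handle name collisions, permissions, and rebuilding links, as the paper notes TMT does.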