The avalanche of data and exploding costs in the data centre

How data virtualisation influences the overall performance of a data centre. By Ash Ashutosh, CEO, Actifio.

Companies that use data centres today face a number of challenges. The data centre is undergoing radical technological change. Storage media are moving away from tape and hard disk drives (HDD) towards flash memory (SSD), while some companies opt for a mixed mode. Either way, the complexity of the IT landscape in the data centre increases. At the same time, cost pressure is rising, so compute, storage and network resources need to be consolidated. Energy costs are a major factor, which makes it important to run the data centre more efficiently. This is all the more urgent as the avalanche of data grows and the vast storage requirements are significantly reflected in operating costs. Meanwhile, companies want to draw valuable insights from this huge mountain of ‘Big Data’.


Then there is the issue of security. The huge amounts of data must be accessible but secure. This is a difficult balancing act, one that becomes more complex as more data copies float around and enlarge the "attack surface". If fewer copies were created, the number of security-related targets would fall. Administrative and operating costs would also be lower if data growth could be curbed by efficient management.

The volume of data grows daily, not so much because of new data as because of the unchecked proliferation of data copies. But where does the flood of data copies come from? Multiple copies of data are generated in separate silos for different purposes such as data backup, disaster recovery, test, development, analysis, snapshots or migrations. In a 2013 study, IDC estimated that up to 120 copies of specific production data can circulate within a company, and that the cost of managing the flood of data copies had reached 44 billion dollars worldwide. According to the study, 85 percent of storage hardware investment and 65 percent of storage software investment are attributable to excess data copies. As a net result, managing data copies within a company now takes more resources than managing the actual production data.


Stem the flood with Data Virtualisation

The virtualisation of data copies has proven to be an effective measure to make data management more efficient. Integrating data deduplication and optimising network utilisation makes efficient data handling possible. Since less bandwidth and storage are required, very short recovery times can be achieved.

One possible approach is the use of a so-called "virtual pipeline", a distributed object file system in which the fundamentals of data management - copying, storing, moving and restoring - are virtualised. With this approach, virtual copies of the data as it stood at a specific point in time can be assembled from the collection of unique data blocks at any time. When data must be restored, the Copy Data Management solution extracts it from the underlying object file system at a user-defined recovery point and presents it in the application's format. Because the recovered data is mounted directly on a server, no data movement is required, which also contributes to quick recovery times. The recovered data is then immediately available.
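
The idea of building point-in-time virtual copies from a pool of unique blocks can be illustrated with a minimal Python sketch. It is not Actifio's implementation: the `BlockStore` class, the `capture` and `mount` helpers and the tiny four-byte blocks are all hypothetical names and values chosen for illustration, and an in-memory dictionary stands in for the distributed object file system.

```python
import hashlib


class BlockStore:
    """Hypothetical content-addressed store: each unique block is kept once."""

    def __init__(self):
        self.blocks = {}  # SHA-256 digest -> block bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # A block that is already known is not stored a second time.
        self.blocks.setdefault(digest, data)
        return digest


def capture(store: BlockStore, blocks: list) -> list:
    """Record a virtual copy: an ordered list of block digests, not the data."""
    return [store.put(block) for block in blocks]


def mount(store: BlockStore, virtual_copy: list) -> bytes:
    """Reassemble a point-in-time image from the shared block pool."""
    return b"".join(store.blocks[digest] for digest in virtual_copy)


store = BlockStore()
monday = capture(store, [b"AAAA", b"BBBB", b"CCCC"])
tuesday = capture(store, [b"AAAA", b"XXXX", b"CCCC"])  # one block changed

unique_blocks = len(store.blocks)
restored_monday = mount(store, monday)
restored_tuesday = mount(store, tuesday)
```

Two three-block images are held here as four stored blocks, and either day can be reassembled on request. Each virtual copy is only a list of references into the shared pool.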


More efficiency in data handling

The virtual data pipeline is used to collect, manage and provide data as efficiently and effectively as possible. After a single complete snapshot has been created and stored, only the changed blocks of application data are captured, using Change Block Tracking on an incremental-forever principle. Data is collected at the block level as this is the most efficient way to track and transfer changes. The data is used and stored in its native format. There is no need to create backup files or restore data from them, as the data can be both managed and used efficiently.
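
The incremental-forever principle can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a vendor implementation: real Change Block Tracking is typically provided by the hypervisor or a storage driver that records writes as they happen, whereas the hypothetical `changed_blocks` function below simply compares two states of a fixed-size volume. The block size and all function names are made up for the example.

```python
BLOCK_SIZE = 4  # unrealistically small, for readability only


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list:
    """Cut a volume image into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(previous: list, current: list) -> dict:
    """Stand-in for change tracking: {block index: new content} for differences.

    Assumes both states have the same number of blocks (fixed-size volume).
    """
    return {
        index: new
        for index, (old, new) in enumerate(zip(previous, current))
        if old != new
    }


def restore(full: list, increments: list, upto: int) -> bytes:
    """Rebuild the volume as it stood after the first `upto` increments."""
    blocks = list(full)
    for delta in increments[:upto]:
        for index, block in delta.items():
            blocks[index] = block
    return b"".join(blocks)


day0 = b"AAAABBBBCCCC"
day1 = b"AAAAXXXXCCCC"
day2 = b"AAAAXXXXYYYY"

full = split_blocks(day0)  # the single complete snapshot
increments = [
    changed_blocks(split_blocks(day0), split_blocks(day1)),
    changed_blocks(split_blocks(day1), split_blocks(day2)),
]

state_day1 = restore(full, increments, upto=1)
state_day2 = restore(full, increments, upto=2)
```

After the initial full snapshot, each increment carries a single changed block, yet every day's state can still be rebuilt.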

The data is captured on the basis of Service Level Agreements (SLAs) that are set by the administrator. These include the frequency of the snapshots, the type of storage in which to keep them and the retention period. These SLAs can also specify whether snapshots are to be replicated to a remote location or to a cloud service provider. Once an SLA is created, any application or virtual machine can access the data.
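
As a rough illustration of what such an SLA might contain, the sketch below models one as a small Python data structure. The field names, the `SlaPolicy` class, the two helper functions and the example values ("ssd", "remote-site", hourly snapshots, 30 days of retention) are assumptions made for this example and do not reflect any particular product's policy format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SlaPolicy:
    """Hypothetical SLA: how often to capture, where to keep it, for how long."""
    snapshot_interval: timedelta
    storage_tier: str
    retention: timedelta
    replicate_to: Optional[str] = None  # remote site or cloud provider, if any


def snapshot_due(policy: SlaPolicy, last_snapshot: datetime, now: datetime) -> bool:
    """True when the interval since the last snapshot has elapsed."""
    return now - last_snapshot >= policy.snapshot_interval


def expired(policy: SlaPolicy, taken_at: datetime, now: datetime) -> bool:
    """True when a snapshot is older than the retention period."""
    return now - taken_at > policy.retention


gold = SlaPolicy(
    snapshot_interval=timedelta(hours=1),
    storage_tier="ssd",
    retention=timedelta(days=30),
    replicate_to="remote-site",
)

now = datetime(2024, 1, 31, 12, 0)
due = snapshot_due(gold, last_snapshot=datetime(2024, 1, 31, 10, 30), now=now)
recent_expired = expired(gold, taken_at=datetime(2024, 1, 15, 12, 0), now=now)
old_expired = expired(gold, taken_at=datetime(2023, 12, 1, 12, 0), now=now)
```

Expressing the policy as data rather than as per-application scripts means that the same rules can be attached to any number of applications or virtual machines.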

For the data management element, a "golden" image or "master copy" of the production data is held and constantly updated according to the SLA. This "golden copy" is always available. A copy of selected production data for testing, development, analysis, etc. can be provided within minutes without affecting production. The "golden copy" can also be transferred to an off-site location for disaster recovery.


Positive impact on the data centre

The virtualisation of data copies in the data centre relieves production systems and supports data backup, disaster recovery and business continuity. As for server backup, the conventional NDMP backup server becomes obsolete. Full backups are no longer necessary, as an image of an arbitrary point in time can be mounted at any time. For long-term storage of images, Copy Data Management solutions use efficient deduplication and compression. Older images can also be mounted easily, and lost or damaged data can be recovered within minutes.

Copy Data Management also supports disaster recovery requirements within the business. Data can be replicated to a remote location using different technologies, depending on the company's requirements. Synchronous or asynchronous LUN mirroring is also possible. With special Dedup-Async replication technology, the bandwidth required for data recovery can be significantly reduced. For Dedup-Backup replication, only the required blocks, already deduplicated and compressed, have to be transferred over the WAN for long-term storage.
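
The bandwidth saving from deduplicated replication can be illustrated generically. The following Python sketch is not the Dedup-Async or Dedup-Backup protocol itself, whose details are not described here. It only shows the underlying idea of sending a block across the WAN when the remote side does not already hold it, and compressing it first. The `replicate` function and the dictionary standing in for the remote site are hypothetical.

```python
import hashlib
import zlib


def digest(block: bytes) -> str:
    """Fingerprint used to decide whether the remote site already has a block."""
    return hashlib.sha256(block).hexdigest()


def replicate(blocks: list, remote: dict) -> int:
    """Send only blocks missing at the remote site; return payload bytes sent.

    `remote` maps digest -> compressed block and stands in for the remote store.
    """
    sent = 0
    for block in blocks:
        key = digest(block)
        if key not in remote:
            payload = zlib.compress(block)
            remote[key] = payload
            sent += len(payload)
    return sent


remote_site = {}
block_a = b"A" * 4096
block_b = b"B" * 4096
block_c = b"C" * 4096

first_run = replicate([block_a, block_b, block_a], remote_site)  # A is sent once
second_run = replicate([block_a, block_c], remote_site)          # only C is new
third_run = replicate([block_a, block_b], remote_site)           # nothing to send
```

In this toy run the third replication transfers no payload at all, because every block is already present remotely. A real system would still exchange fingerprints or similar metadata.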

Copy Data Management also relieves the primary storage, as copies of data can be stored separately. The primary storage handles the performance requirements of the production data, while the Copy Data Management solution takes care of all other data tasks.

Copy Data Management also plays to its strengths in application development, testing and analysis. Virtual images can be mounted on any device immediately, without the need to create a complete copy. This reduces the storage required. A full clone can also be created when a new production version is required or for pre-production.
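
One common way to provide such instantly mountable, writable images without copying the data is copy-on-write, sketched below in Python. Whether a given Copy Data Management product works exactly this way is an assumption. The `VirtualImage` class and its methods are hypothetical and serve only to show why a virtual mount needs far less storage than a full clone.

```python
class VirtualImage:
    """Hypothetical writable view over a shared, read-only golden image."""

    def __init__(self, base_blocks: tuple):
        self._base = base_blocks  # shared with every other view, never modified
        self._overlay = {}        # block index -> privately written block

    def read(self, index: int) -> bytes:
        # Private writes win; everything else comes from the golden image.
        return self._overlay.get(index, self._base[index])

    def write(self, index: int, block: bytes) -> None:
        self._overlay[index] = block

    def private_blocks(self) -> int:
        """Number of blocks this view actually consumes on its own."""
        return len(self._overlay)


golden = (b"AAAA", b"BBBB", b"CCCC")

test_env = VirtualImage(golden)  # available immediately, nothing is copied
dev_env = VirtualImage(golden)

test_env.write(1, b"TEST")       # only this block is stored privately
```

The test environment sees its own change, while the development environment and the golden image remain untouched. The extra storage is one block rather than a full copy.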


Copy Data virtualisation pays off

Copy Data virtualisation can be used for a subset of the data backup in coexistence with existing applications and infrastructure. The greatest efficiency potential unfolds when existing isolated silo systems can gradually be eliminated. A uniform Copy Data Management platform then acts as an efficient platform for data management and solves the growing problem of more and more redundant data copies in an elegant manner.

Companies that prefer to absorb the rapid growth of data by investing in additional storage hardware will sooner or later be thwarted by the cost explosion. SSD, as a new, efficient but still expensive storage technology, should not be filled to capacity with redundant copies of data. With Copy Data virtualisation, data management is streamlined and the data is freed of redundancies. This gives the available storage resources room to cope with data growth in the coming years. At the same time, fast and reliable data availability and improved business continuity are ensured. All this is accompanied by reduced operating costs for the data centre.
