Why running a data centre is like running a marathon

Eric Jorgensen, VP of regional sales EMEA at Virtual Instruments, is running a marathon in April. In this article, Eric describes the similarities between preparing for the event and managing a complex data centre.

If you were to draw a line graph of world records for the fastest marathon time over the past 100 years, it would run from top left to bottom right. Why? There are multiple reasons. Firstly, athletes train smarter than before: they follow strict fitness programmes and diets which, combined, optimise their performance. Secondly, more runners are participating than ever before, making the competition fierce and leaving no room for complacency. Thirdly, today’s athlete has access to state-of-the-art technology that monitors performance, both retrospectively and in real time.

Running a data centre is much the same: businesses are smarter and more efficient than before. Gone are the days when throwing extra resources at the data centre would suffice. Instead, today’s IT managers need to carefully consider where they allocate their resources if they are to optimise performance.

The need to look at the whole IT infrastructure stems from a combination of new pressures: growing volumes of unstructured data, and ever-evolving, complex IT environments continuously reshaped by virtualisation, cloud computing, BYOD, rich media and video streaming, to name just a few. The traditional ways of coping with an increased need for performance are no longer acceptable, and costly over-provisioning is no longer an option for addressing performance spikes or degradation. Managers want to stop wasting their budgets and other resources, and demand to know what’s happening across the entire data centre.

A professional runner can only succeed on the world stage by relying on technology and tools that monitor their performance. The same is true for an IT infrastructure. Tools that give visibility into what’s under the IT system’s hood are already available from most vendors, although the insight they provide is limited. The problem is that these vendor-specific tools are biased: they are restricted to that particular vendor’s products. This limits the level of monitoring and troubleshooting that can be done across the entire environment, and prevents an accurate, unbiased, system-wide view of the infrastructure as a whole.

Most organisations try to manage their infrastructures using the tools provided by their virtual server, fabric and array providers. The problem with such tools is that they do not provide information in real time; if they did, the frequent polling of devices would itself introduce latency. As a result, the reports they generate are based on averages over time, from information usually gathered at one- to five-minute intervals. The other issue with device dependency is that vendors operating the virtual systems can take weeks to report on service levels, because collecting logs from each device can take days.
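To make that point concrete, here is a minimal sketch in Python; the numbers are invented and stand in for no particular tool's output. It shows how an average taken over a five-minute polling window can report a healthy figure while hiding severe latency spikes inside the window:

```python
# Hypothetical illustration: one latency sample per second across a
# five-minute polling window; all values are invented for this sketch.
samples_ms = [2.0] * 295 + [180.0] * 5   # five one-second spikes of 180 ms

window_average = sum(samples_ms) / len(samples_ms)
worst_sample = max(samples_ms)

print(f"5-minute average: {window_average:.1f} ms")  # ~5.0 ms - looks healthy
print(f"worst sample:     {worst_sample:.1f} ms")    # 180.0 ms - the real problem
```

Interval-based averaging is exactly why transient spikes go unseen; a per-transaction, real-time view avoids the averaging effect altogether.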

The Virtual Instruments VirtualWisdom IPM platform can be easily installed on the back of Traffic Access Points (TAPs). It is compatible with hardware from all vendors and has won several awards for its non-disruptive, real-time monitoring and analysis capabilities. Most data centres have built-in redundancy to cope with spikes in demand for capacity; by using VirtualWisdom, IT managers can see precisely where latency and performance degradation occur, and can proactively identify and address traffic glitches before they become major issues. With VirtualWisdom, this granular, comprehensive insight into the IT infrastructure is achieved without affecting application performance or the end-user experience, and it doesn’t add any ‘load’ to the system, as polling does. Once implemented, VirtualWisdom reads the Fibre Channel protocol in real time, end to end, regardless of vendor equipment.

With this depth of insight, data centre managers can truly plan and manage their IT environments and ensure they remain perpetually agile. Using VirtualWisdom, IT infrastructure managers can see how applications are performing, spot existing and emerging bottlenecks, review the utilisation of each component and understand whether these can be optimised to reduce cost. Armed with this information, it is possible to eliminate downtime and latency, create application-aligned SLAs, and set in motion new initiatives such as tiering to ensure that each component (server, switch, fabric, storage) is perpetually optimised, and that the IT infrastructure is therefore running as cost-effectively as possible.
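As a rough illustration of the kind of utilisation review described above, here is a minimal sketch; the component names, figures and thresholds are hypothetical, not output from VirtualWisdom or any other product:

```python
# Minimal sketch of a per-component utilisation review; all names,
# figures and thresholds below are invented for illustration.
utilisation = {
    "server-01": 0.92,   # fraction of capacity in use
    "switch-03": 0.35,
    "fabric-a": 0.78,
    "array-07": 0.12,
}

HIGH, LOW = 0.85, 0.20   # assumed thresholds for flagging components

for component, used in sorted(utilisation.items()):
    if used > HIGH:
        print(f"{component}: {used:.0%} used - potential bottleneck")
    elif used < LOW:
        print(f"{component}: {used:.0%} used - candidate for tiering down")
    else:
        print(f"{component}: {used:.0%} used - within target range")
```

The point of the exercise is the same whatever the tooling: components running hot are latency risks, while components running near-idle are budget being wasted.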

A marathon runner 100 years ago would have worn heavier trainers than their counterpart today; they may have focused on one area of performance while ignoring another. Without monitoring tools, they would almost certainly have over-provisioned in one area while neglecting another.

Likewise, if you don’t TAP and monitor your infrastructure, you are like the athletes of old: over-provisioning, spending budget unnecessarily and allocating resources to the most important applications whether they need them or not. You are also susceptible to unplanned latency and outages. No runner today would train without monitoring in place; they simply wouldn’t be competitive enough. Likewise, today’s leading businesses have had their data centres tapped because it gives them a huge competitive advantage. Why would anyone want to build a system they have no visibility into, and that can’t guarantee application performance?

Today’s professional marathon runner will have a team of dedicated specialists working to improve their performance. The dietician, the coach and the physiotherapist all work in harmony, responding to analytics to ensure optimal performance. Just as in a team of sports specialists, different people within IT need different views of what is going on in an infrastructure. But what happens currently in many organisations is that, instead of the various teams having different views of the same data set, they each have their own perspective of their own data.

This is a long-standing issue that creates conflicts between what different people “see” in the storage network, leading to unnecessary finger-pointing. It is limiting because it denies the team as a whole visibility into the bigger picture. The database administrator, for example, might experience performance problems, but the tools used to assess database performance are quite different from, and often unfamiliar to, the team that serves the data from storage. A common language is key in this regard.

The storage team also doesn’t know what the end-user experience is like, because that is the application team’s job, and vice versa. If they all work from one end-to-end, real-time view of the infrastructure, every team, and the business itself, will immediately see extensive benefits.

As with preparing for a marathon, running an IT infrastructure requires dedication, endurance, communication and, to perform well, continual improvement. In a world where technology has made such progress possible, there is no room for complacency.