Flash/SSD is still an emerging technology and therefore has some way to go before being recognised as a self-standing, viable option. There has been a lot of excitement around its performance potential and, for certain businesses, its deployment can indeed improve productivity and revenues. It therefore makes sense for users to evaluate flash-based storage solutions for applications where I/O is king, such as databases, online banking or large financial services applications. The technology is even more appealing when you look at TCO: moving from an HDD-only solution to an SSD-based one can realise a performance improvement in the order of 30-60 per cent, with a cost per IOP around a third less than that of an equivalent system based on traditional disk.
However, the cost per TB of capacity for SSD is far higher than for HDD, leaving it out of reach of many smaller businesses. The key to successful deployment is selecting the right data sets to reside on this expensive, high-performance storage layer and choosing the most appropriate deployment method to meet application performance and availability requirements.
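To make the trade-off concrete, the back-of-envelope sketch below compares cost per IOP and cost per TB for a single drive of each type. Every figure in it (drive prices, IOPS ratings, capacities) is an assumption chosen purely for illustration, not vendor pricing or benchmark data:

    # Illustrative cost comparison; all figures are assumptions, not vendor data.
    hdd = {"cost": 300.0, "iops": 180, "tb": 0.9}      # assumed 15k rpm enterprise HDD
    ssd = {"cost": 2500.0, "iops": 30000, "tb": 0.8}   # assumed enterprise flash/SSD

    for name, d in (("HDD", hdd), ("SSD", ssd)):
        print(f"{name}: {d['cost'] / d['iops']:.3f} $/IOP, "
              f"{d['cost'] / d['tb']:.0f} $/TB")

Even with these rough numbers the pattern holds: flash/SSD wins comfortably on cost per IOP while losing heavily on cost per TB, which is exactly why selecting the right data sets matters so much.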
Flash/SSD deployed in a server
The use of Flash/SSD memory was pioneered in the industrial and military markets and today represents the default choice for achieving high performance, low latency data access for demanding applications and virtualised environments.
One of the most popular ways of deploying Flash/SSD based storage to provide application acceleration is to install the technology directly within a physical server using the PCIe bus interface. PCIe-based SSDs for use in enterprise server acceleration have been shipping since 2007.
Essentially, the closer the Flash/SSD storage resides to the server CPU and memory architecture, the lower the access latency will be; in the case of current PCIe-based products this is measured in microseconds, typically 15µs for write operations and 50-100µs for read operations. Due to the capacity limitations of Flash/SSD devices, only a limited number of applications can source all of their operational data from this storage layer, so Flash/SSD is usually configured as an ultra-low-latency storage cache fronting the slower traditional HDD-based data stores, whether these reside in the server or out on a SAN.
Flash/SSD technology deployed in this manner generally functions without any sophisticated intelligence to ensure that the most in-demand data sets are ‘always’ available from the high-performance Flash/SSD storage layer; instead, caching algorithms store the most recent transactions before de-staging them to HDD. A key consideration with server-deployed Flash/SSD is ensuring that the server does not jeopardise business continuity by becoming a single point of failure for the storage subsystem. Many vendors are developing ways to make server-based Flash/SSD available to other servers within the data centre in order to create resilience through clustering. This clustering capability may be implemented within the network infrastructure, as we will discuss later, or within the OS or VM kernel.
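The behaviour described above can be sketched as a simple least-recently-used write-back cache. This is a minimal illustration of the de-staging principle, assuming a plain dictionary stands in for the HDD/SAN tier; it is not any vendor's implementation:

    from collections import OrderedDict

    class WriteBackCache:
        # Minimal sketch: recent reads and writes are held in flash, and dirty
        # blocks are de-staged to the slower HDD tier only on eviction.
        def __init__(self, backend, capacity_blocks):
            self.backend = backend            # dict standing in for the HDD/SAN tier
            self.capacity = capacity_blocks
            self.cache = OrderedDict()        # block -> (data, dirty), LRU order

        def write(self, block, data):
            self.cache[block] = (data, True)  # absorbed at flash write latency
            self.cache.move_to_end(block)
            self._evict_if_full()

        def read(self, block):
            if block in self.cache:           # hit: microsecond-class flash read
                self.cache.move_to_end(block)
                return self.cache[block][0]
            data = self.backend.get(block)    # miss: millisecond-class HDD read
            self.cache[block] = (data, False)
            self._evict_if_full()
            return data

        def _evict_if_full(self):
            while len(self.cache) > self.capacity:
                block, (data, dirty) = self.cache.popitem(last=False)
                if dirty:
                    self.backend[block] = data  # de-stage to the HDD tier

Note that a cache like this simply favours recency: nothing guarantees the most valuable data stays resident, which is precisely the limitation described above.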
Flash/SSD deployed in the network infrastructure
We have outlined many of the key limitations of Flash/SSD deployment within a server. These include the local direct connectivity of Flash/SSD storage, which forms a storage silo and limits or complicates access to this internal storage within a networked or clustered environment; the dependence on the availability of compatible drivers for each operating system and hypervisor that will access the cache; and the localised nature of the server-installed architecture, which can present a single point of failure.
A new approach which aims to overcome these limitations is being taken by network infrastructure equipment manufacturers, who are coupling PCIe-based Flash/SSD storage with networking host bus adapters (HBAs). In this scenario the Flash/SSD is still contained on a PCIe card mounted within a server, except that it is decoupled from the operating system and controlled by the HBA firmware. With this implementation the PCIe-based cache is controlled transparently through the existing HBA software stack, so no cache-specific drivers are required by the OS or hypervisor beyond the driver for the HBA card itself.
The Flash/SSD storage is now effectively residing within the SAN infrastructure and can therefore be more readily made available to other servers connected to the SAN fabric.
This can simplify the migration of virtual machines from one physical server to another as any cached data associated with their operation can be more easily transferred across the SAN to a Flash/SSD PCIe card residing within the new physical machine. There are also vendors looking to extend the PCIe interface itself as a data centre and cloud computing fabric by adding standards-compliant extensions that address multi-host and I/O sharing applications.
This approach may represent the future topology for implementing Flash/SSD-based acceleration within clustered and networked environments where shared access to storage resources is a priority. The industry is actively searching for an elegant answer to the scalability problems caused by the increasing adoption of PCIe-based Flash/SSDs: one which allows these expensive resources to become available to more servers and enables a simple-to-implement failover mechanism, so that data remains accessible in the event of a server or SSD fault.
Flash/SSD deployed in an all-flash/SSD networked storage array
There have been many start-up technology companies formed over the past five or more years focused on developing all-flash/SSD networked storage arrays. Their products tout the highest data throughput and lowest latency figures for networked storage arrays in the industry. Of course, storage performance is not the only value proposition: technologies within these products strive to maximise the service life of flash/SSD devices through advanced wear-levelling algorithms, and techniques such as compression and de-duplication are common, ensuring the most efficient use is made of this most expensive storage medium. As the name suggests, ‘all-flash/SSD’ storage arrays have the potential to provide great performance, but at a cost. There are several considerations a user should evaluate before deploying an all-flash/SSD networked storage array.
The fundamental question is whether the business application(s) truly need flash/SSD performance in order to meet service or response levels; if so, the next question is how big the data sets requiring this level of performance are. This second question is very important because flash/SSD is expensive and, due to the high performance of these devices, it does not take many Flash/SSD drives in a networked storage array before the bottleneck becomes the storage controller or the network interface. So if data throughput measured in IOPS is your goal, you must be aware of the point at which adding more flash/SSD drives delivers no further returns in this dimension. If low latency is your goal, then an all-flash/SSD array will always deliver on that score.
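That point of diminishing returns is easy to model: aggregate throughput scales with drive count only until the controller or network ceiling is reached. The per-drive and controller figures below are assumptions for illustration only, not benchmarks:

    # Illustrative only: assumed per-drive and controller limits, not benchmarks.
    SSD_IOPS = 30_000          # assumed IOPS per flash/SSD drive
    CONTROLLER_IOPS = 200_000  # assumed storage controller ceiling

    for drives in (2, 4, 6, 8, 10, 12):
        effective = min(drives * SSD_IOPS, CONTROLLER_IOPS)
        print(f"{drives:2d} drives -> {effective:,} effective IOPS")

At these assumed figures, every drive past the seventh adds capacity but no throughput, because the controller has already become the bottleneck.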
Many of today’s traditional HDD-based networked storage arrays allow the inclusion of SSD drives within their enclosure in a hybrid configuration, so this route may be sufficient if the data sets needing maximum performance are relatively small. Both all-flash/SSD and hybrid options can cater for specific applications that must have maximum storage performance on tap. For most users, however, data storage workloads are very dynamic, and becoming more so with the increasing use of virtualisation. The challenge has become one of ensuring that any data set requiring maximum storage performance has access to flash/SSD technology at the instant it is required, without breaking the bank by deploying an entirely Flash/SSD-based infrastructure. This is where a hybrid solution with intelligent auto-tiering can come to the rescue.
Flash/SSD deployed in a hybrid storage array
Some users swear by capacity, others by performance; but what happens when they need both concurrently? Or when they need one or the other depending on where they are in their calendar? As is often the case in IT, and in life generally, circumstances call for different shades of grey rather than black or white. For example, a business might function well on a disk drive environment most of the time, but when it comes to the end of the month, quarter or year it might need a faster infrastructure to cope with spikes in performance demand. In these data centres users can benefit from combining a small percentage of Flash/SSD (usually 5-10%) with HDDs, controlled by an intelligent autotiering solution, to accelerate application response times.
Autotiering solutions have been available from the major storage system providers for the past two to three years. These systems usually consist of two or more tiers of storage combining Flash/SSD with various levels of HDD. At the core of an autotiering solution is an algorithm which analyses incoming data access requests over a period of time in order to determine the most in-demand data sets over the analysis period. The aim is to ensure that the most in-demand data is migrated to the high-performance Flash/SSD tier, while less in-demand data is positioned on the most appropriate HDD layer based on frequency of access. Hybrid systems that include autotiering vary with regard to their level of complexity and the amount of administration required by the end user.
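At its simplest, such an algorithm counts accesses per extent over the analysis window and promotes the hottest extents to flash. The sketch below is a deliberately naive illustration of that principle (the extent IDs, tier size and the plan_migrations helper are all invented for the example), not any vendor's algorithm:

    from collections import Counter

    def plan_migrations(access_log, ssd_extents, current_ssd):
        # access_log:  iterable of extent IDs touched during the analysis window
        # ssd_extents: number of extents the flash/SSD tier can hold
        # current_ssd: set of extent IDs currently resident on the flash tier
        heat = Counter(access_log)                       # accesses per extent
        hottest = {e for e, _ in heat.most_common(ssd_extents)}
        promote = hottest - current_ssd                  # hot data still on HDD
        demote = current_ssd - hottest                   # cooled-off data on SSD
        return promote, demote

    # Extents 7 and 42 dominate the window; extent 3 has gone cold.
    log = [42, 7, 42, 7, 42, 3, 7, 42]
    print(plan_migrations(log, ssd_extents=2, current_ssd={3, 7}))  # ({42}, {3})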
For some, the policies that determine the behaviour of the tiering algorithms are open to configuration by the user, which of course requires a certain level of storage administration expertise; others offer fully automated operation. Any form of automation will involve a level of performance trade-off compared to a manually configured system that has been ‘tuned’ for optimum performance for a specific application. For most environments, however, the benefits in speed of deployment and ongoing administration that come with a fully automated system will outweigh the advantages that might be gained through user-defined configuration of tiering policies.
One crucial aspect of an autotiering system that must be understood is the frequency with which the system analyses incoming data requests, and the period of time that elapses before the system starts to migrate data between the various tiers based on the results of that analysis.
Many systems perform their data migration as a batch process which takes place several hours after the data monitored during the analysis stage has been served to the requesting users and/or applications. Systems operating in this manner end up moving historical data between the various storage tiers, which in most cases serves no benefit to the end user in terms of improved performance or efficient use of their investment in Flash/SSD technology.
In this scenario, data which was in high demand historically is moved onto Flash/SSD at a later stage, when it may no longer be in demand and therefore no longer in need of Flash/SSD levels of performance. This form of tiering and batch migration will only benefit applications with a constant high demand on a particular data set, since such a set, once moved into the high-performance tier, will still be hot when it arrives there.
Most real-world environments have very dynamic data access patterns, combining machine-generated workloads with data access derived from human activity, a mix that is compounded further when applications are aggregated onto one or more servers running in a virtualised environment. For these typical modern-day dynamic environments it is vital that data analysis and migration are performed in as close to real time as possible, so that users benefit immediately from the increased performance of Flash/SSD.
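One generic way to move towards real-time behaviour (a common technique sketched here, not a claim about any particular product) is to replace fixed batch windows with an exponentially decayed ‘heat’ score per extent, so that recent accesses dominate and stale history fades away automatically:

    import math, time

    class DecayedHeat:
        # Heat score with exponential decay: an access counts 1.0 now and
        # half as much after half_life seconds. Sketch only.
        def __init__(self, half_life=300.0):
            self.rate = math.log(2) / half_life
            self.scores = {}  # extent -> (score, last_update_time)

        def touch(self, extent, now=None):
            now = time.time() if now is None else now
            score, then = self.scores.get(extent, (0.0, now))
            decayed = score * math.exp(-self.rate * (now - then))
            self.scores[extent] = (decayed + 1.0, now)

        def heat(self, extent, now=None):
            now = time.time() if now is None else now
            score, then = self.scores.get(extent, (0.0, now))
            return score * math.exp(-self.rate * (now - then))

With scores like these, an extent hammered ten minutes ago can rank below one accessed moderately in the last few seconds, so promotion decisions track the current workload rather than its history.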
Conclusion
It is clear that Flash/SSD offers a solution to the bottleneck that traditional HDD-based storage presents when faced with the demands of today’s powerful servers running multiple applications in a virtualised environment. The lowest latency and most rapid access to data can be achieved by moving the storage elements as electronically close to the CPU as possible but, as we have illustrated, moving the storage from the inherently resilient SAN back into the server is not without pitfalls in terms of reliability and complexity.
For each potential application of Flash/SSD technology the user needs to evaluate the true level of performance required, the size of the data sets demanding such performance, and whether the data sets in question will be the only ones demanding these levels of performance or whether the workload could shift dynamically over time. For certain applications, such as acceleration of a specific database, a server-based or all-flash array solution may be the optimum choice.
For most general applications a hybrid solution utilising intelligent real-time autotiering will deliver the performance required for the data sets that demand high performance at the precise moment those demands are received. The challenge for the user is one of balancing performance, cost, complexity and reliability to determine the solution that best fits their data workload demands.