THESE ARE VERY DIFFICULT QUESTIONS to answer. And although we assume that infrastructure vendors understand our performance requirements, they often don’t; unfortunately, they are also incentivised to sell highly over-provisioned storage infrastructure.
Imagine, instead, that the next time you’re choosing your storage kit, you know where your data pinch points and I/O bottlenecks are. You’re confident your new system will handle your growth projections. And you’re certain the product you choose gives you the performance you truly need, rather than a whole host of excess capacity that you’re paying for but will never use.
The key is to have the right data to help you make intelligent purchasing and deployment decisions: decisions that optimise the cost/performance trade-off. It sounds obvious, but until recently accessing and analysing that data was practically impossible.
Fortunately, there is a new way to remove the guesswork from storage performance planning: performance validation, in which workload modelling, load generation appliances and performance analytics accurately simulate production workloads in your pre-production test and development environment and produce the data insights you need to make the right storage decisions.
In most companies, storage is rapidly becoming the most significant component of IT infrastructure spending and the largest source of application performance issues. The requirements of low cost and maximum performance are at odds with one another: over-provisioning clearly wastes valuable company funds, but under-provisioning can be equally damaging, resulting in lost customers or unproductive employees.
Understanding the characteristics of each workload access pattern is fundamental to properly architecting storage systems, but until now there’s been a lack of insight into common applications’ actual storage performance requirements.
As an example, accessing high-resolution files, such as videos or images, often requires very high bandwidth and low response times, which puts extreme demands on storage infrastructure. Transaction processing applications typically involve small block sizes and relatively low aggregate throughput, but their latency requirements place equally severe, albeit different, demands on the storage systems. Until now, it has been almost impossible to characterise these I/O patterns and find out what those demands really are.
It’s only natural that IT architects want to take advantage of the newest storage products and technologies to support the evolving needs of their business units. But to do that, they need an accurate assessment of how a storage system or technology, like flash storage or OpenStack storage protocols, will perform in their production environment, in advance of live production deployment. As storage vendors don’t know your application workloads and are still learning about these new technologies, a vendor-independent storage performance validation process needs to be implemented. Storage infrastructure performance validation allows storage architects and engineers to deploy cost-effective, highly responsive storage systems that deliver predictable performance.
The process starts by creating I/O profiles of production workloads, gathering data from the logs or performance monitoring tools provided by your storage or switch vendor. This includes data on read/write ratios, data vs. metadata commands, random vs. sequential access, the distribution of file or block sizes, queue depths, and other key metrics that define the I/O profile hitting your production arrays over a set period of time.
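To make this concrete, here is a minimal sketch of how such a profile might be assembled, assuming the array or switch vendor’s monitoring tool can export per-interval statistics to CSV. The file name and column names used here (read_iops, write_iops, block_size_kb, queue_depth, random_pct) are hypothetical and will differ from vendor to vendor.

    # A minimal sketch: condense exported monitoring statistics into an I/O profile.
    # The CSV path and column names are hypothetical and vary between vendors.
    import csv
    from statistics import mean

    def build_io_profile(csv_path):
        reads, writes, blocks, depths, randoms = [], [], [], [], []
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                reads.append(float(row["read_iops"]))
                writes.append(float(row["write_iops"]))
                blocks.append(float(row["block_size_kb"]))
                depths.append(float(row["queue_depth"]))
                randoms.append(float(row["random_pct"]))
        total_iops = mean(reads) + mean(writes)
        return {
            "avg_iops": total_iops,
            "read_write_ratio": mean(reads) / total_iops if total_iops else 0.0,
            "avg_block_size_kb": mean(blocks),
            "avg_queue_depth": mean(depths),
            "random_pct": mean(randoms),
        }

    profile = build_io_profile("array_stats_export.csv")
    print(profile)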
These I/O profiles can then be entered into the workload modelling application. Once in the model, parameters can be varied across nearly any dimension to find the optimal storage product and configuration. This enables storage architects to make direct, apples-to-apples comparisons across storage technologies and products, and to identify a storage system’s performance limits for your workloads under a variety of conditions.
This means that the next storage upgrade can be a carefully planned event instead of a disruptive fire drill. And the storage architect can now perform ‘what-if’ analysis to answer questions such as what happens to response times:
- If the read/write ratio changes from 40%/60% to 50%/50%?
- If user requests increase by 2X, 5X or even 100X?
- If block request sizes increase from 1KB to 8KB or 64KB?
- If the distribution of file sizes changes from primarily small files to a mix of small and larger files?
- If the compression or deduplication ratios increase from 2:1 to 5:1 or 10:1?
- And many more… (a simple illustrative sweep over a few of these parameters follows below).
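For illustration only, the sketch below runs a crude ‘what-if’ sweep over load multipliers and block sizes, approximating response time as base latency / (1 − utilisation). A real workload modelling tool uses far richer models calibrated against the measured profile; every constant below (base latency, IOPS and bandwidth ceilings, the write penalty) is a hypothetical placeholder, not a measurement of any real array.

    # A toy 'what-if' sweep, for illustration only.
    BASE_LATENCY_MS = 0.5        # assumed 8KB read latency at very low load
    MAX_IOPS = 200_000           # assumed small-block IOPS ceiling
    MAX_BANDWIDTH_MBPS = 2_000   # assumed large-block bandwidth ceiling
    BASE_IOPS = 20_000           # assumed current production request rate

    def response_time_ms(iops, block_kb, read_pct):
        mbps = iops * block_kb / 1024.0
        # the array saturates on either IOPS (small blocks) or bandwidth (large blocks)
        utilisation = max(iops / MAX_IOPS, mbps / MAX_BANDWIDTH_MBPS)
        # writes assumed 50% more expensive than reads on this hypothetical array
        base = BASE_LATENCY_MS * (read_pct + 1.5 * (1.0 - read_pct))
        if utilisation >= 1.0:
            return float("inf")  # saturated: requests queue without bound
        return base / (1.0 - utilisation)

    for load in (1, 2, 5):                # user requests increase by 1X, 2X, 5X
        for block_kb in (1, 8, 64):       # block request sizes grow from 1KB to 64KB
            rt = response_time_ms(BASE_IOPS * load, block_kb, read_pct=0.4)
            print(f"{load}X load, {block_kb}KB blocks -> {rt:.2f} ms")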
Lastly, storage infrastructure performance validation makes it much easier to budget properly for future systems, providing IT planners with reliable information about a new project’s potential impact on current storage resources.
Storage workload modelling that truly reflects real-world applications, along with high performance load generation appliances, enables organisations to accurately emulate the scale and I/O profile of production workloads. That means they can find the limits of infrastructure before deploying in production. With a comprehensive characterisation of application workloads and storage infrastructure, IT planners and architects can dramatically improve the effectiveness of storage engineering across the entire storage lifecycle.
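Dedicated load generation appliances do this at production scale; as a much simplified stand-in, the sketch below translates a captured profile into a job file for the open-source fio load generator, replaying a workload with the same read/write mix, block size and queue depth. The target path, run length and profile values are illustrative assumptions.

    # A minimal sketch, using open-source fio as a stand-in for a dedicated
    # load-generation appliance. Target path, run length and profile values
    # are hypothetical (e.g. the dictionary from the earlier sketch).
    profile = {
        "read_write_ratio": 0.4,   # 40% reads / 60% writes
        "avg_block_size_kb": 8,
        "avg_queue_depth": 16,
        "random_pct": 80,
    }

    def render_fio_job(profile, target="/mnt/test/fio.dat", runtime_s=300):
        read_pct = round(profile["read_write_ratio"] * 100)
        rw_mode = "randrw" if profile["random_pct"] > 50 else "rw"
        return (
            "[global]\n"
            "ioengine=libaio\n"
            "direct=1\n"
            "time_based=1\n"
            f"runtime={runtime_s}\n\n"
            "[replayed-workload]\n"
            f"filename={target}\n"
            "size=10g\n"
            f"rw={rw_mode}\n"
            f"rwmixread={read_pct}\n"
            f"bs={round(profile['avg_block_size_kb'])}k\n"
            f"iodepth={round(profile['avg_queue_depth'])}\n"
        )

    with open("replayed-workload.fio", "w") as f:
        f.write(render_fio_job(profile))
    # then run:  fio replayed-workload.fio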
Adopting storage performance validation as an IT best practice means storage infrastructure architects can:
- Fully evaluate new storage technologies like flash arrays, hybrid arrays, OpenStack, Ceph and converged architectures to assess whether they cost-effectively support key business initiatives.
- Determine which storage system or vendor offers the highest performance, or is the most cost-effective, when running specific applications.
- Optimise configurations, such as solid state/hard disk trade-offs, tiering strategies and caching implementations, by measuring their performance impacts.
- Understand the performance impact of deduplication and compression on workloads, as each vendor’s algorithms are unique and proprietary to their products; performance can easily vary by 2X to 5X depending on the vendor and the workload.
- Measure how server and storage virtualisation affect I/O patterns and their impact on performance.
- Find the performance limits of new storage systems before deploying them into production.
- Determine which protocol (e.g. Fibre Channel vs. iSCSI) or version (e.g. NFSv3 vs. NFSv4.1) is best for each application workload.
- Quantify the impact application upgrades and firmware updates have on storage systems and switches before deploying them, as they can have a profound effect on latency.
- Speed up problem identification and isolate performance bottlenecks by creating a verbatim replay of I/O traffic, enabling immediate understanding of the root causes of problems.
While the uses and applications can be diverse, storage performance validation will empower storage architects and engineers to achieve their most critical goals of cost reduction, performance assurance and risk mitigation. When implemented as an IT best practice, the risk of organisational interruption due to performance slowdowns (or outages) will be radically reduced, and storage purchasing will be fully aligned with performance requirements.