Architecture is sexy. The proof? Brad Pitt loves architecture. Brad Pitt has twice been named “Sexiest Man Alive.” Donald Trump’s garish, obnoxious buildings indicate that he hates architecture. Donald Trump has never been named “Sexiest Man in the Room,” even when he’s alone.
Do you need more proof? Check out the EMC World Backup and Recovery Keynote with Guy “Incredible Hulk” Churchward. Data Protection Architecture is sexy. Well, to some anyway.
In the first installment of this series, I provided the overview of the Protection Storage Architecture. Over the next three installments, I’ll address each core architectural element in more depth.
Protection Storage – Why is it Important?
Why do you need protection storage? Even in the cloud, you can’t store your protection data in thin air. You need media for storing data.
The choice of protection storage is important because hardware evolution drives software architecture and innovation (e.g. multi-core processors led to VMware, NAND memory led to the iPad, etc.). The architectural limitations of traditional backup software stem from designing for tape as the hardware anchor (those constraints are why I criticize tape):
- Slow/Expensive. If you ever plan on recovering data from tape, you must run regular full backups. Unfortunately, full backups overload the primary storage, application server and network.
- Vendor Lock-in. You cannot recover data from tape without the original backup application because it writes data in a proprietary format. Backup software justifies the need to store data differently on tape than on disk because tape is a sequential, offline medium (of course, that doesn’t explain why each vendor needs a different format).
- Poor Service Catalog. “Sorry, I can’t set up a separate schedule for you – I’ve got to balance the load on the tape drives.” “Sorry, I can’t eliminate just that sensitive data from the backups; it’s on shared tape.” “Sorry, I need to find an older drive that can read that tape.” The common theme – service in a tape environment is sorry.
Protection storage is the anchor of the architecture, so how do you make the right choice?
Protection Storage – Cost-Optimized, Durable, Multi-Use
In storage, one size does not fit all. The design center for protection storage is different from that of scale-out NAS, transactional storage or geo-distributed object storage. The unique requirements for protection storage are:
- Data Durability. Since protection spans disaster recovery, backup, and archival, you need storage that ensures the data will be there – potentially decades later. First, you need the layers of RAID, disk scrubs, checksums, and a file system built to be resilient to errors (to paraphrase our old slogan, “Tape sucks, but disks without the right storage stack aren’t so great, either”). Second, you need storage that has life beyond your current hardware purchase. Like the historical society said, “We’re preserving Abe Lincoln’s original axe – a priceless legacy! That’s why we’ve replaced the handle twice and the head once.” You’ll replace the compute and storage components of your protection storage over the lifespan of the data, but still need to preserve the data. We call this the “Data Invulnerability Architecture.”
- Multi-Use. Organizations cannot afford the capital and operational expense of three separate storage solutions for backup, disaster recovery and archival. Conversely, they cannot afford a “least common denominator” approach that does not meet the RPO/RTO/compliance needs (e.g., everything on tape). Converged protection via versioned replication is the best way to scale protection performance with data growth. Therefore, customers need to select protection storage that can support versioned replication to unify backup, disaster recovery, and archival.
- Cost-Optimized. Price matters. The first step toward cost optimization is space optimization – reducing the number of copies (e.g., backup, DR, archive convergence), the overhead of storing multiple copies (deduplication) and the footprint of each copy (compression). The second step is developing software to leverage lower-cost hardware components. This includes scaling performance with CPU and capacity with low-cost, large-capacity storage… while not compromising data durability and space optimization.
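To make the space optimization and durability points a bit more concrete, here is a minimal Python sketch of a toy chunk store. It is an illustration only, not how any particular product works: the `ChunkStore` class, the fixed 4 KB chunk size, and the choice of SHA-256 and zlib are all assumptions made for the example. Identical chunks are stored once (deduplication), each stored chunk is compressed, and a `scrub` pass re-verifies checksums to catch silent corruption.

```python
import hashlib
import zlib


class ChunkStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size  # fixed-size chunking is an assumption
        self.chunks = {}              # SHA-256 digest -> compressed chunk

    def write(self, data):
        """Store data and return its recipe (ordered list of chunk digests)."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:          # deduplication
                self.chunks[digest] = zlib.compress(chunk)  # compression
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the data, verifying every chunk against its checksum."""
        parts = []
        for digest in recipe:
            chunk = zlib.decompress(self.chunks[digest])
            if hashlib.sha256(chunk).hexdigest() != digest:
                raise IOError(f"corrupt chunk {digest}")
            parts.append(chunk)
        return b"".join(parts)

    def scrub(self):
        """Return the digests of stored chunks that fail verification."""
        bad = []
        for digest, blob in self.chunks.items():
            try:
                ok = hashlib.sha256(zlib.decompress(blob)).hexdigest() == digest
            except zlib.error:
                ok = False
            if not ok:
                bad.append(digest)
        return bad

    def stored_bytes(self):
        """Physical bytes consumed after deduplication and compression."""
        return sum(len(blob) for blob in self.chunks.values())


# Two "backups" that share most of their content.
store = ChunkStore()
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096
monday_recipe = store.write(monday)
tuesday_recipe = store.write(tuesday)
print(len(monday) + len(tuesday), "logical bytes,", store.stored_bytes(), "stored")
```

Real protection storage layers RAID, file system resilience and much smarter chunking on top of this idea, but the sketch shows why the second backup costs far less than the first, and why a periodic scrub is cheap insurance.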
A well-designed protection storage platform differs from other types of storage. While some users try to deploy a “good enough” solution with a generic storage platform, it does not meet their evolving needs. Protection has a unique design center: cost-optimized, multi-use, highly durable storage.
Protection Storage – Pools Not Silos
When protection began to affect business, vendors from multiple layers in the stack (e.g., applications, hypervisor, and primary storage) added mechanisms to optimize their cost/protection. Unfortunately, this led to the accidental architecture – each team deploying its own protection solution.
As companies streamline their infrastructures, they focus on eliminating silos. Data protection groups need to consolidate the protection storage silos into protection storage pools, or data lakes. Two key requirements:
- Multiple Data Sources. Protection teams must support data generated by backup application clients, primary applications (e.g., SQL dumps), primary storage (e.g., snapshots, clones, replicas), and hypervisors. The set of protection protocols includes: tape (or VTL), NAS, OST, Data Domain BOOST, iSCSI, FCP and object interfaces. The protection storage needs to support a variety of workloads: traditional file server backup (weekly full, nightly incremental), VMware CBT backup (incremental forever), the assortment of Oracle mechanisms (e.g. RMAN SBT, incremental merge, multiplexing) and more.
- Evolution. The set of protection workflows will continue to increase. While many vendors act as if customers will “rip and replace” their old infrastructure, most protection environments look like a geological dig – multiple layers of active, historical solutions. While switching the backup software can be very difficult (the “vendor lock-in” point), at a minimum you should consolidate the protection storage. Therefore, select protection storage that can handle the variety of workloads in your environment today (mainframe, iSeries, VTL, disk-centric backup, etc.) and evolve to meet tomorrow’s workloads.
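The workloads listed above stress protection storage very differently. As a rough back-of-the-envelope sketch in Python, compare how much data a weekly-full/nightly-incremental schedule moves against an incremental-forever schedule. The function names, the 1,000 GB dataset and the 2% daily change rate are hypothetical values chosen for the example, and the model assumes a constant dataset size and a uniform change rate.

```python
def weekly_full_nightly_incremental(dataset_gb, daily_change_rate, days):
    """GB transferred with a full every 7th day and incrementals otherwise."""
    total = 0.0
    for day in range(days):
        if day % 7 == 0:
            total += dataset_gb                      # weekly full
        else:
            total += dataset_gb * daily_change_rate  # nightly incremental
    return total


def incremental_forever(dataset_gb, daily_change_rate, days):
    """GB transferred with one initial full, then only changed data."""
    if days <= 0:
        return 0.0
    return dataset_gb + dataset_gb * daily_change_rate * (days - 1)


# Hypothetical inputs: 1,000 GB dataset, 2% daily change, four weeks.
traditional = weekly_full_nightly_incremental(1000, 0.02, 28)
forever = incremental_forever(1000, 0.02, 28)
print(f"weekly full + nightly incremental: {traditional:.0f} GB")
print(f"incremental forever:               {forever:.0f} GB")
```

Under these assumptions the traditional schedule moves roughly three times as much data over four weeks. Your numbers will differ, but an anchor platform has to handle both patterns.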
While it is unlikely that any company will fully consolidate into one protection storage offering (there’s always a one-off somewhere), the protection team can select an anchor platform that becomes the core of virtually every offering – today and in the future.
Protection Storage – Your Suit and Tie
When designing your protection architecture, the protection storage is your most important choice. Hardware drives software architecture and design, and the wrong storage choice will constrain your ability to evolve and improve your services. Our most successful customers have created pools of protection storage that consolidate their customers and workflows – past, present, and future.
So, why is your protection storage your suit and tie? Because it would have been too cheesy, even for me, to say that the protection storage architecture is bringing this.
Backup is broken. Backups are slow and restores are even slower. Even worse, application administrators assume they’re unprotected because they have no visibility into backup processes. To compensate for these limitations, IT teams deploy point data protection products. Snapshots, clones and replicas can improve backup and recovery performance while giving the data owners more control. However, they can also lead to internal chaos, which can have significant business and IT performance side-effects.
Executive indifference, not data growth, is the biggest challenge for data protection. Most CIOs and IT directors acknowledge that they spend too much money on their backups and that the services they offer are inadequate for the business, but they don’t see it as a strategic priority on the level of cloud, big data or mobile computing. They don’t see the connection between backup and business acceleration or deceleration, whatever the scenario.
Data protection matters to your business. It can accelerate application development, improve productivity and drive revenue… as some successful companies have discovered.
A little more than 10 years ago, I wrote an article for InfoStor magazine exposing IT’s Dirty Little Secret: Backup was grossly inefficient – and IT knew it.
Back then, nobody talked about the backup problem because there was little that could be done about it. Backup teams either lacked the tools to determine if they were backing up everything they were supposed to or they couldn’t extract the information they needed with the tools they did have. And so, discussion in IT shops centered around backup speeds and feeds (of tape devices, mind you)… there was little talk of recovery. Scary.
Five years later, disk-based backup and the increasing adoption of deduplication technology revolutionized the way everyone thought about and did backup. IT conversations shifted from “Is my data protected?” to “How fast can I recover my data in the event I need to?” Discussions focused on improving RTOs, RPOs and continuing to reduce backup windows. All was good.
Today, while much of the conversation in IT shops still centers on backing up and recovering faster and more efficiently, there are new concerns about the ability of backup to keep pace. Once again, issues of trust, albeit different ones, are surfacing; only this time the stakes are much, much higher. If you’re thinking in terms of downtime costs, think again. There’s a direct link between backup and application deployment, productivity (business and IT), innovation and revenue.
At EMC World last week, Guy Churchward, president of EMC Backup Recovery Systems, talked about the new strategic relevance of backup to an organization in a Cube Interview. Churchward compares backup to the plumbing inside a house: “If your house doesn’t have a good infrastructure, it doesn’t matter what the drapes look like,” he says.
So, how’s your plumbing?
In this short four-minute video, EMC’s own Gene Maxwell gives a straightforward presentation that answers the question, “Why SourceOne?” SourceOne is a next-generation archiving solution that captures and protects your valuable email, file and SharePoint content for company compliance and legal discovery.