Architecture is sexy. The proof? Brad Pitt loves architecture. Brad Pitt has twice been named “Sexiest Man Alive.” Donald Trump’s garish, obnoxious buildings indicate that he hates architecture. Donald Trump has never been named “Sexiest Man in the Room,” even when he’s alone.
Do you need more proof? Check out the EMC World Backup and Recovery Keynote with Guy “Incredible Hulk” Churchward. Data Protection Architecture is sexy. Well, to some anyway.
In the first installment of this series, I provided the overview of the Protection Storage Architecture. Over the next three installments, I’ll address each core architectural element in more depth.
Protection Storage – Why is it Important?
Why do you need protection storage? Even in the cloud, you can’t store your protection data in thin air. You need media for storing data.
The choice of protection storage is important because hardware evolution drives software architecture and innovation (e.g. multi-core processors led to VMware, NAND memory led to the iPad, etc.). The architectural limitations of traditional backup software stem from designing for tape as the hardware anchor (those constraints are why I criticize tape):
- Slow/Expensive. If you ever plan on recovering data from tape, you must run regular full backups. Unfortunately, full backups overload the primary storage, application server and network.
- Vendor Lock-in. You cannot recover data from tape without the original backup application because it writes data in a proprietary format. Backup software justifies the need to store data differently on tape than on disk because tape is a sequential, offline medium (of course, that doesn’t explain why each vendor needs a different format).
- Poor Service Catalog. “Sorry, I can’t set up a separate schedule for you – I’ve got to balance the load on the tape drives.” “Sorry, I can’t eliminate just that sensitive data from the backups; it’s on shared tape.” “Sorry, I need to find an older drive that can read that tape.” The common theme – service in a tape environment is sorry.
Protection storage is the anchor of the architecture, so how do you make the right choice?
Protection Storage – Cost-Optimized, Durable, Multi-Use
In storage, one size does not fit all. The design center for protection storage differs from that of scale-out NAS, transactional storage or geo-distributed object storage. The unique requirements for protection storage are:
- Data Durability. Since protection spans disaster recovery, backup, and archival, you need storage that ensures the data will be there – potentially decades later. First, you need the layers of RAID, disk scrubs, checksums, and a file system built to be resilient to errors (to paraphrase our old slogan, “Tape sucks, but disks without the right storage stack aren’t so great, either”). Second, you need storage that has life beyond your current hardware purchase. Like the historical society said, “We’re preserving Abe Lincoln’s original axe – a priceless legacy! That’s why we’ve replaced the handle twice and the head once.” You’ll replace the compute and storage components of your protection storage over the lifespan of the data, but still need to preserve the data. We call this the “Data Invulnerability Architecture.”
- Multi-Use. Organizations cannot afford the capital and operational expense of three separate storage solutions for backup, disaster recovery and archival. Conversely, they cannot afford a “least common denominator” approach that does not meet the RPO/RTO/compliance needs (e.g., everything on tape). Converged protection via versioned replication is the best way to scale protection performance with data growth. Therefore, customers need to select protection storage that can support versioned replication to unify backup, disaster recovery, and archival.
- Cost-Optimized. Price matters. The first step toward cost optimization is space optimization – reducing the number of copies (e.g., backup, DR, archive convergence), the overhead of storing multiple copies (deduplication) and the footprint of each copy (compression). The second step is developing software to leverage lower-cost hardware components. This includes scaling performance with CPU and capacity with low-cost, large-capacity storage… while not compromising data durability and space optimization.
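To make the space-optimization and durability ideas above a little more concrete, here is a minimal Python sketch of a toy store that deduplicates fixed-size chunks, compresses each unique chunk, and verifies checksums during a scrub. Everything in it (the `DedupStore` class, its method names, the 4 KB chunk size, SHA-256 and zlib) is an assumption chosen for illustration. It is not how Data Domain or any other product is implemented, and real systems typically use variable-size chunking and much more.

```python
import hashlib
import zlib


class DedupStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # SHA-256 digest -> compressed chunk bytes
        self.recipes = {}  # backup name -> ordered list of digests

    def put(self, name, data):
        """Store a backup, keeping only one copy of each unique chunk."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:          # deduplication
                self.chunks[digest] = zlib.compress(chunk)  # compression
            recipe.append(digest)
        self.recipes[name] = recipe

    def get(self, name):
        """Rebuild a backup from its chunk recipe."""
        return b"".join(
            zlib.decompress(self.chunks[d]) for d in self.recipes[name]
        )

    def scrub(self):
        """Return digests whose stored bytes no longer match their checksum."""
        bad = []
        for digest, blob in self.chunks.items():
            try:
                ok = hashlib.sha256(zlib.decompress(blob)).hexdigest() == digest
            except zlib.error:
                ok = False  # blob is too damaged to decompress
            if not ok:
                bad.append(digest)
        return bad


# Two nightly backups that share most of their data.
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096

store = DedupStore()
store.put("monday", monday)
store.put("tuesday", tuesday)
```

The two backups contain six chunks between them, but the store keeps only the three unique ones, and `scrub()` gives a periodic way to detect a chunk that has silently gone bad before anyone needs to restore it.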
A well-designed protection storage platform differs from other types of storage. While some users try to deploy a “good enough” solution with a generic storage platform, it does not meet their evolving needs. Protection has a unique design center: cost-optimized, multi-use, highly durable storage.
Protection Storage – Pools Not Silos
When protection began to affect business, vendors from multiple layers in the stack (e.g., applications, hypervisor, and primary storage) added mechanisms to optimize their cost/protection. Unfortunately, this led to the accidental architecture – each team deploying its own protection solution.
As companies streamline their infrastructures, they focus on eliminating silos. Data protection groups need to consolidate the protection storage silos into protection storage pools, or data lakes. Two key requirements:
- Multiple Data Sources. Protection teams must support data generated by backup application clients, primary applications (e.g., SQL dumps), primary storage (e.g., snapshots, clones, replicas), and hypervisors. The set of protection protocols includes: tape (or VTL), NAS, OST, Data Domain BOOST, iSCSI, FCP and object interfaces. The protection storage needs to support a variety of workloads: traditional file server backup (weekly full, nightly incremental), VMware CBT backup (incremental forever), the assortment of Oracle mechanisms (e.g. RMAN SBT, incremental merge, multiplexing) and more.
- Evolution. The set of protection workflows will continue to increase. While many vendors act as if customers will “rip and replace” their old infrastructure, most protection environments look like a geological dig – multiple layers of active, historical solutions. While switching the backup software can be very difficult (the “vendor lock-in” point), at a minimum you should consolidate the protection storage. Therefore, select protection storage that can handle the variety of workloads in your environment today (mainframe, iSeries, VTL, disk-centric backup, etc.) and evolve to meet tomorrow’s workloads.
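The workloads named above differ a lot in how much data they push at the protection storage. As a rough back-of-the-envelope sketch, the Python below compares a weekly-full, nightly-incremental schedule with an incremental-forever schedule. The function names and the sample numbers (a 1,000 GB data set with 20 GB changing per day) are hypothetical, and the model ignores deduplication, compression and retention.

```python
def weekly_full_nightly_incremental(total_gb, changed_gb_per_day, days):
    """GB sent when every 7th day is a full backup and the rest are incrementals."""
    moved = 0
    for day in range(days):
        moved += total_gb if day % 7 == 0 else changed_gb_per_day
    return moved


def incremental_forever(total_gb, changed_gb_per_day, days):
    """GB sent with one initial full backup, then only changed data each day."""
    return total_gb + changed_gb_per_day * (days - 1)


# Hypothetical example: 1,000 GB data set, 20 GB changes daily, 28 days.
traditional = weekly_full_nightly_incremental(1000, 20, 28)
forever = incremental_forever(1000, 20, 28)
print(traditional, forever)  # 4480 1540
```

Under these assumed numbers the traditional schedule moves roughly three times as much data in four weeks, which is one reason a shared pool has to be sized for the mix of workloads it actually receives.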
While it is unlikely that any company will fully consolidate into one protection storage offering (there’s always a one-off somewhere), the protection team can select an anchor platform that becomes the core of virtually every offering – today and in the future.
Protection Storage – Your Suit and Tie
When designing your protection architecture, the protection storage is your most important choice. Hardware drives software architecture and design, and the wrong storage choice will constrain your ability to evolve and improve your services. Our most successful customers have created pools of protection storage that consolidate their customers and workflows – past, present, and future.
So, why is your protection storage your suit and tie? Because it would have been too cheesy, even for me, to say that the protection storage architecture is bringing this.