Goodbye Tape, Hello Data Protection as a Service

Over the weekend I was talking with some friends about their experience attempting to buy a home in Silicon Valley. From what I gathered, the housing market is heating up and it appears to be a seller’s market yet again. This made me think about an EMC customer, Healthcare Realty Trust.

Lisa Matzdorff


Voice of Customer, Data Protection and Availability Division
I have a passion for listening, more specifically, listening to customers share their IT stories, their experiences, their successes! Over the past 7 years in the role of customer reference manager and customer advocacy manager, I’ve had the pleasure of listening to amazing stories and meeting some very interesting people. The one thing that makes my job even better…I get to share those stories. When I’m not working, I’m volunteering with foster children, running 5K fun runs, playing fashion consultant “What Not To Wear” style, traveling, and watching reality TV.

The Asian Pressure Cooker

The pressure on Asian businesses has never been greater due to fluctuating economies, greater global competition, massive data growth, and the need to comply with complex local and international laws and regulations. Many Asian organizations have long relied on traditional infrastructure such as cheap disk and tape to manage and protect their data; however, these approaches no longer work due to long backup times, unreliable restorations and ballooning costs.


Source: IDC, 2013.

In a recent study, IDC analyzed a number of companies in Asia Pacific, China and Japan and concluded that unless organizations transform their data protection environments, they will lack the agility and efficiency to compete both locally and internationally. The paper focuses on a sample of companies across Asia, representing a broad range of industries and sizes.

One common finding across all companies was a dramatic reduction in costs and risks after implementing EMC’s Data Protection solutions. In fact, the average savings were $2.6m annually, as summarized in the chart below.

The paper highlights that data protection challenges are just as much a business and trust issue as they are an IT operational problem, and the sooner that companies address these challenges, the sooner they will be out of the pressure cooker and sitting at the dinner table with the Business Managers.

Shane Moore
I have been in the IT industry for close to 20 years and started my career as an Officer in the Australian Air Force. For my first posting, I had a choice to either manage a national network of servers or run a warehouse (the physical kind). Thankfully, I chose the former and subsequently managed infrastructure in a number of public and private organizations. Later, I started selling and then marketing IT solutions for Computer Associates and now EMC. I have a passion for technology and I am excited by the way it continues to transform our lives. In my current role, I work across Asia promoting EMC’s data protection solutions, spending time with analysts and writing articles for traditional and social media. In my spare time, I provide IT support for my family and enjoy the outdoors. For the record, Top Gun is my favorite movie of all time!

Cut the Tape, Defrost Your Mainframe


You might be wondering: Brain freeze and mainframe – what’s the connection?

The brain part should be obvious. Mainframes process the most critical transactions and store the most important data for many organizations. Therefore, the mainframe is the “brains” of key operations.

The not so obvious part is why your mainframe is not delivering to its fullest potential – suffering from “brain freeze.” The reason is simple: tape.

To read more of the post on our sister site Thought Feast, click here.


Lady Backup
Lady Backup’s career in IT dates back before the time of the Spice Girls. Initially I started in high tech journalism in the US and eventually transitioned to become an industry analyst. My analyst years also coincided with my education – during this period of my life I was working on my MBA. After 7 years of going to school at night, I graduated with distinction with an Information Age MBA degree from Bentley University (at the time it was still Bentley College) located just outside of Boston. With degree in hand, what’s a restless girl to do next? This is where networking with fellow classmates led to a job at EMC. Starting at our Hopkinton headquarters, I moved outside of the US with EMC International when I felt it was time for my next change. Today, Lady Backup is an American on the loose in the world. Living outside the United States has been a fascinating experience. For the moment I call England home. But I’m feeling my next wave of restlessness coming. Here are two hints: I love sunshine and I’m improving my Spanish.

Tape Is Dead, Part II

How should I back up data that doesn’t deduplicate? It’s one of the questions I’m asked often – by both our engineers and our customers. In fact, a TBW reader raised the issue in response to my recent post. Therefore, I’d like to explain how we approach such fundamental challenges and then share the approaches that I recommend to our customers. 

The Fundamental Challenge
Difficult challenges require a system-level solution approach because the problems are too complex to be solved by one component. It is this systems view that drives my push to transition from tape to disk.

Over the past twenty years, tape-centric backup systems have evolved about as far as they can. Meanwhile, disk-centric backup continues to evolve rapidly because disk storage systems alter the constraints in the system. Therefore, “backup to disk” isn’t code for “write a tar image to a Data Domain VTL” (especially since VTL still implies a tape-centric backup approach).

Usually, one of the disk backup approaches can meet our customers’ RPO/RTO and reliability needs at the right cost… or come closer to the mark than anything else available. More importantly, with both the freedom and investment to innovate, disk-centric backup architecture will more effectively address IT challenges today and in the future.

The Approach: Four Use Cases
There are four “non-dedupe” backup use cases I hear about:

  1. Low-retention, non-repeating data (e.g., database logs): Customers usually choose between two options: Option 1: Store the logs on the backup appliance, getting only local compression, but with consolidated protection storage management.  Option 2: Store the logs on non-deduplicating disk systems and coordinate the storage management (e.g., replication). Regardless, disk is usually the best option to handle the performance requirements for high value data with such an aggressive half-life.
  2. High churn environments (e.g., test data): These data sets experience 30%+ daily change. Most customers opt for short-term retention because the data is so short-lived. In that case, I recommend snapshots/clones and/or replication. While the snapshots consume a significant amount of space, they save a tremendous amount of IOPS. Too often, organizations ignore the heavy I/O load caused by backups. Not only are most of the backup reads not served from cache, but they often pollute the cache. In high-churn environments, IOPS are even more precious, since the storage system’s disks are so heavily loaded with the application load (and the churn makes flash a non-ideal fit). Therefore, at a system level, it is often less expensive to consume extra space for snapshots than to consume the IOPS for traditional backups.

    As an additional benefit, the snapshots enable faster recovery from current versions of data. The choice to replicate becomes a cost/benefit analysis around the availability of data vs. the cost of a second storage array and network bandwidth. Tape-centric approaches compromise application performance (or require overbuying the primary storage performance), recover stale copies of the data, and recover the data so slowly that customers prefer to regenerate the data (e.g., application binaries, satellite images, oil and gas analytics, or rendered movie scenes).

  3. Environments in which you don’t run multiple full backups and have little cross-backup dedupe (e.g., images, web objects, training videos): If data is never modified and rarely deleted, customers don’t run full backups. Since a backup appliance derives much of its space savings from deduplicating redundant full backups, dedupe rates fall in the absence of multiple fulls. The best approach for protecting these data sets is replication, especially if the replicated copy can service customer accesses.

    Since the data is not modified, there is little value from retaining multiple point-in-time copies. Therefore, the most critical recovery path is that of a full recovery; nothing is faster than connecting to a live replica, nothing is scarier than depending on multiple incremental tape restores. Furthermore, these types of datasets tend to have distributed access patterns, so technologies like EMC’s VPLEX can improve both protection and performance with the same copy (another way of deduplicating copies).

  4. Environments in which the application behavior compromises dedupe (e.g., compressing data that you modify): Think of an application that either modifies compressed files in place (e.g., open file, decompress file, modify file, recompress file) or creates multiple compressed copies of data (e.g., compressed or encrypted local database dumps). This workflow tends to create 10x more data modification than the actual new data.

    In these cases, you have two options:  Option 1: Decompress the data for the backup and/or write the database dumps directly to the dedupe storage, so you can get the optimal deduplication. Option 2: Treat the data as Type 1 or Type 2 discussed above.

    However, if the customer is unwilling to decompress the data and wants long-term retention, this is the most plausible instance in which to leverage tape. I’m just not sure it’s widespread enough to justify deploying a tape environment; I would fully explore cloud options first.
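
To make the fourth use case concrete, here is a minimal sketch of why compressing data before backup tends to defeat deduplication. It is an illustration under stated assumptions, not how any particular backup appliance works: it uses fixed-size 4 KiB chunks and SHA-256 fingerprints (real products typically use variable-size chunking), a synthetic 1 MiB data set, and hypothetical helper names (`chunk_hashes`, `shared_fraction`, `make_dataset`). It changes 16 bytes in place, then measures how many chunks of the new version already exist in the old version, first for the raw data and then for zlib-compressed copies.

```python
import hashlib
import random
import zlib

CHUNK = 4096  # assumption: fixed-size chunks, chosen only to keep the sketch simple


def chunk_hashes(data: bytes) -> list[bytes]:
    """Fingerprint each fixed-size chunk of the data."""
    return [hashlib.sha256(data[i:i + CHUNK]).digest()
            for i in range(0, len(data), CHUNK)]


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of chunks in `new` whose fingerprint already exists in `old`."""
    old_set = set(chunk_hashes(old))
    new_hashes = chunk_hashes(new)
    return sum(h in old_set for h in new_hashes) / len(new_hashes)


def make_dataset(size: int, seed: int = 0) -> bytes:
    """Build compressible, non-repeating synthetic data from a small vocabulary."""
    rng = random.Random(seed)
    vocab = [b"backup", b"restore", b"snapshot", b"replica",
             b"archive", b"dedupe", b"volume", b"journal"]
    out = bytearray()
    while len(out) < size:
        out += rng.choice(vocab) + b" "
    return bytes(out[:size])


original = make_dataset(1 << 20)   # 1 MiB of synthetic data
edited = bytearray(original)
edited[100:116] = b"X" * 16        # a small in-place modification
modified = bytes(edited)

raw_shared = shared_fraction(original, modified)
compressed_shared = shared_fraction(zlib.compress(original),
                                    zlib.compress(modified))

print(f"raw chunks already stored:        {raw_shared:.1%}")
print(f"compressed chunks already stored: {compressed_shared:.1%}")
```

On the raw data, only the chunk containing the edit is new. In the compressed copies, a small edit near the start usually changes the output stream from that point onward, so far fewer chunks match. That is the effect Option 1 avoids by decompressing before backup.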

When I advocate for disk, I’m asking the industry to consider both the entire portfolio of disk solutions and the possibilities that can be developed. As we’ve been discussing on LinkedIn, as soon as you make disk your design center, it opens a whole new set of architectural approaches. And that’s the transition that is so exciting – moving from putting disk inside a tape-centric architecture to really designing around disk.

As you can see from the examples above, the most challenging environments for data protection require a system-level approach. In fact, some of them demand approaches that look beyond just the protection infrastructure. As we’ve talked about in the past, backup teams need to connect with application, virtualization, and storage owners to provide the services that their users need. With those connections, they can deliver better integrated, more innovative solutions to their customers.


Stephen Manley


CTO, Data Protection and Availability Division
Over the past 15 years at both EMC and NetApp, I have traveled the world, helping solve backup and recovery challenges - one customer at a time (clearly, I need to optimize my travel arrangements!). My professional mission is to transform data protection so that it accelerates customers’ businesses. I have a passion for helping engineers pursue technical career paths (without becoming managers), telling stories about life on the road, and NDMP (yes, that’s NDMP).

Tape is Alive? Inconceivable!

To begin each year, Joe Tucci brings 400+ people together for the EMC Leadership Meeting. We spend a little time reflecting on the prior year, but spend most of it focusing on the future. After that, the Backup and Recovery Systems Division leadership spends another day planning our future. So, imagine my surprise when I saw, on the Backup and Recovery Professionals Group on LinkedIn, a thoughtful discussion about the role of tape in the backup environment. I’ve just spent a week discussing cloud, big data, and the evolution of data protection… and we’re still talking about tape? Inconceivable!

While I appreciate both the maturity of the discussion and the resiliency of tape, it’s a waste of time. Every moment spent talking about tape is a moment not spent discussing the future of data protection – deduplication, snapshots, versioned replication, cloud, ???. The opportunity cost of discussing tape frustrates me.

Tape is not the answer of the future. It’s increasingly less useful in the present – unless you’re talking about data that you don’t ever intend to actually access again. Here’s the reasoning:

  • Full recovery from a complete server or storage array outage: As capacity increases, the only way to recover the data quickly enough to be useful is to have it online and spinning somewhere (e.g., replication). The issue here isn’t so much disk vs. tape as it is tape-centric backup architectures. If you need to wait until all of the data is restored to a new system (and writing data on the recovering system is usually the bottleneck), you’ve been down too long. Tape doesn’t hit the bar here.
  • Rollback from corruption: If most of the data is still good, but there’s been some corruption (user or system caused), the only way to recover quickly is some sort of changed block rollback (e.g., snapshot/clone rollback, changed block recovery for VMs, etc.). In general, tape-centric backup architectures make rollbacks near-impossible.
  • Granular recovery: When it comes to granular recovery, it’s all about getting the right version of your data. In this case, recovery is all about backup – when you can do backup more frequently and store more copies (space-efficiently, of course), you’re more likely to get the version of the data you want. In general, disk-centric architectures that leverage some sort of data optimization (e.g., dedupe, snapshots, clones) enable you to keep more and more frequent backups.
  • Archival recovery: Traditionally, this has been where tape has made its arguments around relevance – long-term, cost-effective, low-power retention. But here’s the problem. In general, we’ve all agreed that backup is non-optimal for data archival. It’s rare that you can track the lifecycle of data (e.g., ‘I want to recover a file from server X, from 12 years ago. Does anybody remember what server the file was on 12 years ago?’), you’re unlikely to have the infrastructure to access it (e.g., ‘Does anybody have a DEC server with application X, version Y?’), and even less likely to manage the tape infrastructure lifecycle to enable the data recovery. As I’ve seen customers go tapeless at multiple companies (as I’ve worked at multiple vendors), they use the transition to disk to re-examine and reduce their retention periods, and deploy a true archival solution.
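
The arithmetic behind the first bullet can be sketched in a few lines. This is a back-of-the-envelope illustration only: the capacities and the 500 MB/s sustained write rate are hypothetical numbers chosen for the example, and `restore_hours` is a hypothetical helper. It assumes the write rate of the recovering system is the bottleneck and uses decimal units.

```python
def restore_hours(capacity_tb: float, write_mb_per_s: float) -> float:
    """Hours needed to rewrite `capacity_tb` terabytes at a sustained write rate.

    Assumes the recovering system's write throughput is the bottleneck
    and uses decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes).
    """
    seconds = (capacity_tb * 1e12) / (write_mb_per_s * 1e6)
    return seconds / 3600


# Hypothetical capacities, all restored at a hypothetical 500 MB/s.
for capacity_tb in (10, 100, 500):
    hours = restore_hours(capacity_tb, 500)
    print(f"{capacity_tb:>4} TB at 500 MB/s: {hours:6.1f} hours")
```

Under these assumptions, a 100 TB system takes more than two days to restore in full, no matter which medium the backup copy sits on. A live replica avoids that wait because there is nothing to rewrite before the data is usable.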

I think one customer put it best: “I’m legally required to store data for 30 years, but I’m not required by law or business to ever recover it. That data is perfect for tape.”

Do you think we need to spend more time talking about tape? Do you think tape has a bigger role to play today or in the future? If you had new money to spend, would you put it on tape? Am I being overly dismissive? Please weigh in here or on LinkedIn – Backup & Recovery Professionals Group.


Stephen Manley


CTO, Data Protection and Availability Division
Over the past 15 years at both EMC and NetApp, I have traveled the world, helping solve backup and recovery challenges - one customer at a time (clearly, I need to optimize my travel arrangements!). My professional mission is to transform data protection so that it accelerates customers’ businesses. I have a passion for helping engineers pursue technical career paths (without becoming managers), telling stories about life on the road, and NDMP (yes, that’s NDMP).