Do you remember pinhole projectors? When a solar eclipse approached, a teacher would warn you, “Don’t look directly at the sun or you’ll burn your retinas.” Instead, you’d build a pinhole projector by poking a small hole in a long box. During the eclipse, you’d see a projection of the eclipse inside.
As with a solar eclipse, sometimes the best way to view massive industry shifts is to look at the projection onto adjacent industries. As we move from VMworld to Oracle OpenWorld, it’s an ideal time to look at how hypervisor and application vendors learned to start worrying and love data protection.
Why Care About Data Protection Now?
From the early 1990s to the mid-2000s, backup architectures didn’t change. In response to a service request, the backup team would install an agent on the application server. That agent would then read the data, package it into a proprietary data format, and send it to the backup target. Recoveries would run in the exact reverse. For the past 20 years, backup software vendors have tweaked that architecture (e.g., incremental/differential backups, elimination of media servers, source dedupe), but the fundamentals haven’t changed. Backup agents still sift through huge pools of data, pack the information into vendor-proprietary formats, and send the resultant blob of data into the dedicated backup infrastructure.
Meanwhile, for nearly 15 years, the data sources (e.g., applications, hypervisors) didn’t care about backup. In 2002, I met a backup administrator struggling with backup and recovery windows for his biggest databases. He asked the sales representative from that database vendor how to improve backup performance. The answer: “Upgrade your hardware. Faster server. Faster network. Faster tape drive. Not my problem.”
Then, everything changed. As with any big shift, it’s hard to pin down one cause (more data) because everybody has different drivers (lots more data), and there are so many factors (dear God, there is so much data). As backup and recovery times increased, customers worried about scaling databases and virtualizing business-critical applications. When backup threatened their revenue streams, the vendors who had ignored backup suddenly cared very deeply.
Data Protection – Not Just for Backup Vendors
As the data sources (e.g., applications and hypervisors) began to care about backup, they noticed:
- They could make backup and recovery fast. By sitting in the data path, they can track what needs to be protected and recovered. They see every new piece of data as it is written, modified, or deleted. The backup client has no such option; at backup time, it must search through all the data looking for what’s new. While the backup agent wastes hours searching for needles in haystacks, the data source can enable protection to be completed in minutes.
- They could simplify backup and recovery for their users. When I have a problem, I like to solve it quickly, myself. If I need to learn a new tool, I get cranky. If I need to call or email somebody, I get ornery. When DBAs call the backup team to recover their data, they get downright cantankerous! The application and hypervisor vendors understood that they could add a simple interface to allow their customers to run 99% of their own recoveries.
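The first point, tracking changes in the data path versus searching for them at backup time, can be illustrated with a toy sketch. Everything here is hypothetical: `TrackedVolume` and `agent_scan_backup` are made-up names, the "volume" is just an in-memory list of blocks, and real features such as VMware's Changed Block Tracking or Oracle's Block Change Tracking are far more involved. Real agents also tend to rely on timestamps or archive bits rather than comparing content, but the cost profile is similar: the agent's work grows with the total amount of data, while the tracker's grows only with what changed.

```python
class TrackedVolume:
    """Toy block store that notes which blocks change as they are written,
    the way a data source sitting in the write path could."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.changed = set()  # indexes written since the last backup

    def write(self, index, data):
        self.blocks[index] = data
        self.changed.add(index)  # tracking is a side effect of the write itself

    def incremental_backup(self):
        """Return only the blocks written since the last backup, then reset."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta


def agent_scan_backup(volume, last_copy):
    """Agent-style incremental: examine every block and compare it with the
    previous backup copy. Returns the delta and the number of blocks examined."""
    delta = {}
    examined = 0
    for i, data in enumerate(volume.blocks):
        examined += 1
        if last_copy.get(i, b"") != data:
            delta[i] = data
    return delta, examined
```

With a 1,000-block volume and two writes, both approaches produce the same two-block delta, but the agent-style scan has to examine all 1,000 blocks to find it, while the tracked volume already knows the answer.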
Of course, any company can dramatically improve a variety of functions. The bigger question is: Which innovation delivers the best return on investment (ROI) for the significant effort it requires? And this is where you can see that the transformation of the backup industry extends well beyond the backup vendors:
- VMware: With their investment in vStorage APIs for Data Protection (VADP), Changed Block Tracking, vSphere Data Protection, and Site Recovery Manager, VMware has invested heavily in both the data and control paths for data protection.
- Oracle: With their investment in Oracle Recovery Manager (RMAN), Block Change Tracking, and Incremental Merge, Oracle has invested heavily in both the data and control paths for data protection.
I selected these vendors because we’re book-ending their conferences, but Teradata, SAP, Microsoft’s back-office applications, and many more, have invested heavily in data protection. To repeat – data protection has been a top priority for applications with little/no stake in data center infrastructure, limited monetization of protection, and huge opportunity cost vs. enhancing core functionality. Data protection is changing.
What Does This Mean for the Backup Industry?
When you stare at one fairly stagnant item long enough (e.g., grass), even small changes (e.g., grass growing) can seem revolutionary. Sometimes it’s best to look around and re-calibrate your internal measurement of what is transformative. That is certainly true for the backup market.
As the data sources take a more active role in protecting their data, traditional backup architectures won’t survive. Slow agents, proprietary data formats, and dedicated backup infrastructure are relics of the past. Products with those legacy architectures will struggle to adapt to the new world.
However, while the architecture will change, companies still need the value provided by backup teams: managing the protection storage infrastructure, driving policy compliance, cataloging the data, and reporting across the environment. The protection team will continue to deliver this value, just with a more flexible, open architecture.
The backup market is truly transforming, but if you just look at the backup software vendors, you’ll miss the chance to lead. You can best measure the change by looking at the application and hypervisor vendors. Then, you can channel your sense of urgency into selecting a solution. But, I warn you, don’t look directly at the legacy backup software vendors. Watching old stars go supernova can burn your eyes out.