One Lens to the Future: Visibility without Boundaries

You may have recently read The Backup Window’s May 17 post “Forget the Drapes…How’s Your Plumbing?”, in which Heidi Biggar talks about the important relationship between backup architecture and application deployment, productivity, innovation and ultimately revenue.

Also in that article, Heidi shared a video of Guy Churchward, president of EMC’s Backup Recovery Systems division, at EMC World last month. In this video, Guy compares backup to the plumbing of a house – without solid backup, it doesn’t matter what the rest of your environment looks like, because you won’t be able to scale to meet the exponential data growth that big data brings.

I’m going to take that argument one step further and tell you that while having a good backup and recovery infrastructure (a.k.a. the plumbing) IS important, effective management of that infrastructure may require you to mask it.  Let me explain.

Modern, unified, non-disruptive data protection infrastructures are complex, though you might not have all the components shown here in play today.  It really depends on what the business actually needs.

Data Protection Advisor

Starting on the left side we see some virtualized hosts with applications, some physical hosts and primary storage. You may have some particularly challenging mission-critical applications with aggressive RTOs and RPOs. You may be using replication for those.  But all of this needs to be backed up and protected.  You’re likely using your backup manager of choice, which may also be backing up your VMs directly.  And eventually those save sets of data are going to make their way down to an archive device.

That’s the infrastructure Guy spoke of – it’s important and it performs a vital task for your business.  However, as I mentioned, it’s complex. It’s simply not possible to effectively monitor each data protection component individually, particularly if there are multiple backup applications or many archive devices.  Visibility is crucial, and in order to get a holistic, end-to-end view of the environment you need to mask the complexity.  That’s where data protection management software like Data Protection Advisor can help.

Case in point:  in speaking with many customers over the past few years, we’ve learned that the SLAs they were being asked to meet as part of their organizations’ transformation processes weren’t focused on the success of any individual backup (i.e., they didn’t care whether a backup succeeded on the first, second or nth attempt) but rather on the speed and precision of the overall process. Customers really wanted to know that their data was being protected within specified time periods and that it had reached the designated vaulting location/device.

And to be able to do this, you need to be able to see and manage the entire environment.
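To make that concrete, here is a minimal Python sketch of the kind of end-to-end check such tools automate (the job records and field names below are entirely hypothetical, not DPA’s actual data model): instead of grading each job in isolation, it asks whether every client has a recent enough protected copy that actually reached the vault.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified job records; a real data protection management tool
# would collect these from each backup application, replication pair and archive.
jobs = [
    {"client": "erp-db01", "completed": datetime(2013, 6, 10, 2, 40), "vaulted": True},
    {"client": "mail01",   "completed": datetime(2013, 6, 10, 9, 15), "vaulted": True},
    {"client": "web03",    "completed": datetime(2013, 6, 9, 23, 5),  "vaulted": False},
]

def sla_report(jobs, window_end, max_age_hours=24):
    """Flag clients whose newest copy is too old or never reached the vault."""
    breaches = []
    for job in jobs:
        too_old = window_end - job["completed"] > timedelta(hours=max_age_hours)
        if too_old or not job["vaulted"]:
            breaches.append(job["client"])
    return breaches

print(sla_report(jobs, window_end=datetime(2013, 6, 10, 12, 0)))
# -> ['web03']  (its protected copy never reached the designated vault)
```

The check itself is trivial; the hard part, and the value of a management layer, is gathering consistent status from every component so a report like this can be produced across the whole environment.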

Abstracting Management as a Change-Enabler

There is another important capability these tools bring. By separating the management view of the protection infrastructure from the various technologies deployed, IT is free to make operational changes to the environment.  (I’ll explain this too.)

Service providers and enterprise IT shops alike are looking for ways to beat out the competition by investing in new technologies that will help them differentiate on cost or performance. But when one technology is swapped for another, management and visibility of the protection environment are lost, or at least disrupted. Each new technology brings its own variation on ‘how things should be done’.

However, by abstracting the management view of the entire environment away from the underlying technology, the service provider’s view and control of end-to-end protection processes are buffered from any change in the data protection ‘plumbing’.  These management tools become a change enabler (or transformation enabler) by simplifying the environment and removing the worry and hassle that often accompany transformation.  In other words, your management tools can enable change independently of your underlying data protection technology.

Somewhat related to this is EMC’s recent announcement of ViPR Software-Defined Storage. You’ve probably heard the promise that ViPR can “Virtualize Everything. Compromise Nothing.”

ViPR provides a revolutionary approach to storage automation and management, transforming existing heterogeneous physical storage into a simple, extensible and open virtual storage platform. This means that organizations don’t have to give up choice as they grow, and management costs don’t have to go through the roof either.

With ViPR, organizations get a simple, unified way to manage virtual and physical storage that not only protects their investments today, but can also dynamically adapt and respond to future requirements.

While DPA isn’t quite the same as ViPR, and ViPR is intended for primary storage, the underlying goal is the same: simplify complexity through automation and centralized management.

And that gives you the freedom of choice and the flexibility to select the plumbing components you need to drive your transformation.

Tom Giuliano

Marketer and EMC Data Protection Advisor Expert
I love to listen to customers discuss their data protection challenges, their experiences and their needs, and I’ve had a lot of opportunity to do it. For the past 15 years, I’ve brought network and storage products to market through roles in sales, product management and marketing. When I’m not driving go-to-market initiatives, identifying unique and creative methods to build product awareness or launching products, you’ll likely find me cycling, skiing, boating or running. And, who knows, maybe you’ll hear some of my more interesting experiences in one of my posts from time to time.

The Right Architecture Is Priceless, Part III


“Architecture should speak of its time and place, but yearn for timelessness.” – Frank Gehry

During the EMC Backup Recovery Systems’ keynote at EMC World, Guy “Haybale” Churchward shared his perspective as a British homeowner. His house was built 150+ years ago, and it will stand for another 150+ years. Therefore, while he makes it his home right now, he feels a responsibility to improve it for the next owner (check out his recent blog post). The home ties together people who will never meet. The right architecture, from St Paul’s in London to Hagia Sophia in Istanbul to Guy’s house, can both connect and inspire across generations.

In this series, I introduced the Protection Storage Architecture and explored the Protection Storage component. This time – Data Source Integration. (To start the series at the beginning, click here.)

Data Source Integration – Why Does it Matter?

Performance and visibility. When they are missing, users lose confidence in the protection team. They slow their development. They roll their own solutions. They lose data.

Performance and visibility. That’s how the protection team can drive the business. Faster backups and restores minimize data loss and downtime, reduce management complexity, and increase the likelihood of data recovery. With visibility into the protection process, application teams and end users gain confidence, accelerate innovation, and stay safe.

Performance and visibility. How can the protection team deliver? Data source integration. Each team believes its data source – the application, the hypervisor, the storage array or the server – “owns” the data (in a virtualized world, multiple teams claim data ownership, until things go wrong; then, all of a sudden, it’s the backup team’s data). The data source touches every bit of information that its users generate or access; its management interface provides administrative control.  By sitting in the data path, the data source can optimize protection performance. By incorporating protection controls into its UI, the data source can provide visibility to the data owners in their preferred interfaces.

Data source integration delivers the protection performance and visibility that organizations need.

Data Source Integration – Performance

Data sources optimize protection performance compared to traditional backup clients because they sit in the data path.

A standard backup agent works very hard, but not very smart. The agent sits idle until backup time, when it wakes up and looks for new data to protect (I’m assuming you’re running incremental-forever versioned replication – if you’re still running frequent fulls, this discussion may feel like sitting in a Peugeot 306, watching the TGV thunder by). The agent looks at every file in the data set, checking timestamps to detect whether each has been modified. Yes, it looks at every … single … file. Once it locates a new or modified file, a modern agent then checksums the data to identify the new data within that file (a critical optimization for protecting large files or backing up over a low-bandwidth network). Backup clients run the storage equivalent of a search for needles in haystacks. This approach is far better than running a full backup (maybe you’re sitting in a Ford Aspire watching South Korea’s KTX2 train zoom past), but customers continue to hit traditional backup clients’ scalability limits.

The data source, on the other hand, can track exactly what data needs to be protected. Whether it is the application, the hypervisor, the storage, or the server, it owns the data. The data source executes the users’ every data creation, modification, and deletion, so it can keep a log, a journal, or a bitmap of those changes. Therefore, at backup time, the data source already knows exactly what to protect. There is no need to look at every file, no need to checksum every chunk. Instead of searching for needles in a haystack, the data source hands the backup process a pre-ordered set of needles. Even better, when it comes time to restore, it can ask for just those needles back!
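To illustrate the difference, here is a rough Python sketch (the paths and the change journal are made up, and this is not any vendor’s actual API) contrasting a walk-everything agent with a backup that simply consumes the change log the data source already keeps:

```python
import hashlib
import os

def full_scan_backup(root, last_backup_time):
    """Traditional agent: walk every file, check its timestamp, then checksum
    the changed ones to find the new data inside them."""
    changed = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_time:   # every single file is inspected
                with open(path, "rb") as f:
                    digest = hashlib.sha256(f.read()).hexdigest()
                changed.append((path, digest))
    return changed

def change_log_backup(change_log):
    """Data-source approach: the source already journaled every create, modify
    and delete, so backup just reads the pre-built list of 'needles'."""
    return sorted(set(change_log))

# Hypothetical journal a data source might have kept since the last backup.
journal = ["/vm1/disk.vmdk:block:1042", "/vm1/disk.vmdk:block:88517"]
print(change_log_backup(journal))
```

The asymmetry is the whole point: the first function’s cost grows with the size of the data set, while the second’s grows only with the amount of change.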

Some of the leading vendors that can optimize backups via tracking changes include: VMware (Changed Block Tracking), Oracle (Block Change Tracking), EMC (RecoverPoint, TimeFinder Clones, SyncIQ, …), NetApp (SnapDiff), and Microsoft (Filter Drivers and Change Journal). In other words, the options are widespread.

Because data sources sit in the data path and can track the new and modified data, integrating with them can reduce backup and recovery times from days and hours to minutes or seconds.

Data Source Integration – Visibility

Data sources optimize protection visibility by connecting to users via their preferred interfaces.

Technology developers define ‘simple’ differently from the rest of us. Take EMC’s “very simple” goal management system. Every quarter, I must approve my employees’ MBOs in this application. While it has a well-designed UI and management flow, you can guess what I’m doing 5 minutes before close-of-business on the MBO deadline. I’m screaming at my computer about the incomprehensibility of the system, the pointlessness of MBOs and the series of Palahniuk-level horrors I want to visit upon HR, IT and the application developers. When you log in to an interface once a quarter, no matter how simple, you re-learn it each time. If I could approve via my normal tools – email, bug tracking system or source code repository – MBOs would take under a minute. Now that’s simple!

Regardless of how simple, elegant, or fun… another interface adds complexity, especially when the customer rarely uses the interface. End-users and administrators do not want to log into a backup application interface. They want to see and manage their protection from their primary tool – vSphere, Oracle, SAP, Unisphere, NFS/CIFS share, etc. If their application does not support a protection view, the backup vendor should provide an interface with the same look and feel as their common tool. Only then will they feel comfortable with the protection environment.

Not surprisingly, the same data source vendors who are optimizing the protection data path are also enhancing the protection control path.

Because data sources are the users’ central interfaces, integrating with them can improve visibility into, and confidence in, the protection environment.

Data Source Integration – Proof that Data Protection Matters

Performance and visibility have driven the decade-long renaissance of protection innovation. The industry’s data source titans understand that data protection matters. Oracle, Microsoft, VMware, NetApp, and EMC have optimized protection performance and visibility. Ten years ago, these vendors would have said, “That’s a backup software problem” or “Upgrade your hardware to get better backup performance” because companies do not spend resources solving “somebody else’s problem”.

Today, they invest because protection has become their problem. Protection is the primary inhibitor to the growth of their big data applications and infrastructure. As the data sources, they have both the incentive and the unique ability to help solve the problem. Their investment in delivering solutions demonstrates the importance of data protection to your environment.

Therefore, as you design your protection environment for today and the future, data source integration is a critical component of your architecture. Protection has become integral to the data sources, so the data sources must be integrated into your protection architecture.

Stephen Manley

CTO, Data Protection and Availability Division
Over the past 15 years at both EMC and NetApp, I have traveled the world, helping solve backup and recovery challenges - one customer at a time (clearly, I need to optimize my travel arrangements!). My professional mission is to transform data protection so that it accelerates customers’ businesses. I have a passion for helping engineers pursue a technical career path (without becoming managers), telling stories about life on the road, and NDMP (yes, that’s NDMP).

It’s Time for Backup and Archive to Come Home


One thing I do when traveling on a plane is spend time just contemplating. With sleep deprivation and jet lag, my mind wanders in rather abstract ways. My favorite artist is the surrealist Salvador Dali, so perhaps I have heightened activity in that part of my brain.

On a recent trip, I started thinking about an old house of mine back in the UK. The house was built in the early 1800s in North Yorkshire. It was originally designed as a clothing workshop but was converted to a residential property within 40 years. I wasn’t the first owner and I certainly won’t be the last, which led me to think: what’s the real difference between renting and owning a home? Even if you own a house, you are still just a tenant. Eventually, someone else will move in and call your house home.

The first thing we do when we ‘own’ a house is make it OUR home. We upgrade it for our comfort but also with an eye on generating a return on the initial investment. Whether we sell the property or hand it down to loved ones, we hope that we made a smart buying choice and added upgrades that leave a good legacy. Not a gift that only provides pain for years to come.

I would contend you should think about your backup and archive exactly the same way you would think about your house. You are a tenant of your company, and you have a fiduciary responsibility to make the smartest choice possible, because bad stuff happens and we need a plan for the unforeseen.

So what kind of tenant are you? How many major decisions has your company made in the backup and archive space? Have you made decisions for your comfort or for the future? Do you need more than one hand to count them? Is this a good or a bad thing?

At EMC Backup Recovery Systems, we strive to deliver value today and flexibility for tomorrow. Buying into EMC’s Backup and Archive Portfolio isn’t just about solving an immediate need; it’s about laying down a lasting platform that can evolve to resolve the challenges your company might encounter tomorrow. The recently announced EMC Data Protection Suite enables our customers to deploy a variety of data protection software solutions without buying a new point product each time a new challenge arises.

During my time at EMC, I have found the commitment to both the present and the future to be deeply rooted in EMC’s DNA. In the backup and archive space, this approach is critical, and it’s why we continue to invest so much in offering choice and best-of-breed solutions… regardless of which one you choose. This philosophy has built our broad client base and is why, by any measure, we’re the largest backup company in the world. Let’s be honest, if you don’t invest in good plumbing then it doesn’t matter if you have nice drapes.

 

Guy Churchward

President, Data Protection and Availability Division
I'm an enterprise infrastructure hack. Really, if you think of my career as a building, I’ve spent it underneath in the sewer lines and the electric plumbing, making sure things work. Invariably, my businesses end up being called boring. But that’s okay. It means they’re doing exactly what they’re supposed to do, which means their customers can do what they need to do. I come to EMC by way of BEA Systems, NetApp and most recently LogLogic, and my mission is to lead EMC Data Protection and Availability Division's efforts to deliver a protection storage architecture that leaves us all in better shape for the next guy, or gig, that comes along. Oh, and make no mistake about it, I want everyone to know who’s number one in backup, and why.

Are Deleted Emails Really “Gone”?

There’s an interesting news story from Ontario, where an ex-government staffer is being charged with the wholesale deletion of email, apparently in violation of provincial law.  The ex-staffer claimed he was just keeping his inbox clean and believed that all email was backed up and could be restored as needed.  That turned out not to be the case, because government IT services did not have an archive and retained email server backups for just 24 hours.

A few interesting thoughts about this story:

First, if your organization is charged with retaining record content – as are many government agencies, particularly in the US, both for public records and on an as-needed basis for litigation – there is a great deal of risk in leaving that responsibility solely to employees.  Setting up an archive to enable and enforce the organization’s retention policy is the best practice, especially for email content.

Second, although the article seems to imply that the deleted emails are lost forever, that’s not necessarily the case.  But it won’t be simple (or inexpensive) to find them if that becomes a requirement, as it might in the US under applicable law or in an eDiscovery case.  Let’s first assume that the messages really are wiped from the ex-staffer’s mailbox on the server and his computer.  Some messages will still be stored with some of the other 90,000 governmental employees.  They will have messages that he sent to them, along with messages they sent to him or where he was a co-recipient.  Their copies of the messages might be on the email server, but if PSTs are permitted, they could be stored on laptops, desktops, USB memory sticks, file systems or even backup tapes (a PST is a local file of messages created by Microsoft Outlook, the client for Exchange, the email system in this case).  That’s a lot of places to look — and more technical readers can think of a host of other locations.
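To give a flavor of just one of those searches, here is a minimal Python sketch (the search roots are hypothetical) that sweeps file shares for stray PST files; multiply it across every laptop, memory stick and restored tape and you can see where the time and money go.

```python
import os

def find_pst_files(roots):
    """One of many places to look: sweep file shares and local drives for PSTs."""
    hits = []
    for root in roots:
        for dirpath, _, filenames in os.walk(root):
            hits.extend(os.path.join(dirpath, n)
                        for n in filenames if n.lower().endswith(".pst"))
    return hits

# Hypothetical search roots; a real collection effort would also cover
# laptops, USB media, and restored backup tapes.
print(find_pst_files([r"\\fileserver\home", r"C:\Users"]))
```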

Finally, although I’m an advocate of short retention cycles for backup media, a one-day retention for email backups seems very aggressive and potentially risky.  I would first question whether that policy is really being followed, or whether full backups are perhaps being made (and retained) on some other basis.

So while it may be possible to get rid of most email — if you understand the systems at issue and are motivated — there are always a number of places to look.  The only limitations are time and money.

 

Microsoft TechEd 2013: An EMC Backup Perspective

Another Microsoft TechEd has come and gone, and speaking for the EMC Backup and Recovery Team on-site, we found it to be one of our best conferences of the year!  In between playing with (and trying to learn how to use) the new Surface tablets we got at the show, we had some really good conversations with you, the attendees, about the issues you are facing with backup and recovery in a Microsoft cloud ecosystem. Take a look at this video for some visual highlights from the show, along with our take on what attendees were talking to us about regarding backup and recovery solutions for Microsoft.

See you in Houston next year for TechEd 2014!

Alex Almeida

Technology Evangelist, Data Protection and Availability Division
My passion for technology started at an early age and has never stopped. Today, I find myself immersed in data protection. Yep, I live, breathe and tweet backup, availability and archive. In fact, nothing short of fully understanding how things work will keep me from digging deeper. But when I’m not evangelizing on the benefits of backup or technology in general, I can be spotted at a New England Revolution game, behind the lens of a camera or listening to my favorite albums on vinyl. In addition to blogging for The Protection Continuum, you can find me on the EMC Community Network. Also, I'm a member of EMC Elect 2014, and I'm active in the New England VMware User Group (NEVMUG) and the Virtualization Technology User Group (VTUG). Let's get technical!