Backups are broken… they are, you know!

A bold statement, I know – but for many businesses it’s true, even if they don’t know it. To be fair, it’s not necessarily the techie’s fault, or even the fault of the solution. It’s normally broken because it was never really right in the first place…

Why wasn’t it right in the first place?

That’s partly down to people like me and a bit of the fault of the technology.

How this should really work is that a business looks at its model of operation, then at the key systems and data that support it. It then looks at the costs associated with those systems becoming unavailable and makes a judgement on two things:

“How long can I do without that system?” and “How much data am I prepared to lose?”

Most businesses have had backup solutions in place for many years, and the problem with that is that, historically, the IT industry never really asked those questions.

What happened was that us technical folk would look at the data and then at the solutions. Basically, these solutions consisted of software that would move data off primary storage to secondary and even tertiary storage devices. The copies would often be taken off site (on a tape, normally) and stored anywhere from the techie’s bedroom to some huge warehouse by the side of a motorway!

That sort of worked. It gave you secure backup copies of your data and got them away from your production environment, so in the case of a catastrophic failure you could recover your data to shiny new hardware when it became available.

So far so good, you say… However, the problem was that the technology for doing this was (and is) so slow that straight away it starts to dictate your strategy.

How did that traditional approach let us down?

Our approach was based on a bit of software and a storage device to hold a copy of your data. The software would run a backup job and move data from the production environment to the storage device and your backup was done.

However, these backups would often take many hours, because that’s how the technology worked. They also had a high impact on the performance of the system being backed up, so running them during production time made systems slow and unusable. You’re then limited to running them outside of business hours.

However, for many businesses hours have lengthened, which of course means the backup windows have shrunk. Companies can then no longer back up all of their data in the window available, and many customers I’ve spoken to end up having to pick and choose which systems are important enough to back up, and accept the risk of what happens if “less critical” data gets lost.

So what we end up with is a solution that dictates the answer to the questions we posed earlier:

“How long can I do without that system?” and “How much data am I prepared to lose?”

If we look at an example where we can back up our key data once a night and the backup takes 8 hours to run (a rule of thumb is that a restore takes around 3 times as long as a backup), then the answer to the two questions above is:

I’m prepared to be without a key system and its data for around 24 hours, and I’m prepared to lose a complete day’s worth of data (going back to the previous night’s full backup – assuming it works of course!)
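The arithmetic behind that answer can be sketched in a few lines of Python. This is only an illustration of the example above: the function name `implied_objectives` and its parameters are made up for this sketch, the factor of 3 is just the rule of thumb already mentioned, and the data-loss figure assumes the worst case of failing just before the next nightly backup.

```python
from datetime import timedelta


def implied_objectives(backup_duration, backup_interval, restore_factor=3):
    """Return the (recovery time, data loss) that a backup schedule implies.

    restore_factor is the rule of thumb from the text (a restore takes
    around 3 times as long as the backup), not a measured value.
    """
    # How long you are without the system while the restore runs.
    implied_recovery_time = backup_duration * restore_factor
    # Worst case: everything since the last backup is lost.
    implied_data_loss = backup_interval
    return implied_recovery_time, implied_data_loss


# The example from the text: an 8 hour backup, run once a night.
recovery_time, data_loss = implied_objectives(
    backup_duration=timedelta(hours=8),
    backup_interval=timedelta(hours=24),
)
print("Without the system for:", recovery_time)
print("Data lost, worst case:", data_loss)
```

Both figures come out at 24 hours, and both come from how the technology behaves, not from anything the business asked for.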

That is really the crux of why backup is broken for many organisations. Many businesses I meet with are still in a situation where their recovery strategy is dictated by the technology they deploy, and not by the needs of the business.

And if that’s the case for you, then your backup is not only broken, it’s pretty much next to useless.

Let’s look at the example again. What if that backup included a system that was hugely critical? Say the business assessed the risk and concluded that it could not afford to be without this main system for more than 2 hours and could not afford to lose more than 15 minutes of data. For them, a solution that delivers the backup and recovery capabilities we outlined above is a complete waste of time.

What’s the answer then?

This is not a sales pitch, don’t worry. It’s safe to say that at Gardner we have presented solutions to a whole host of differing businesses with very different recovery needs, and there is a whole set of solutions out there to meet the most extreme of data protection and recovery needs. That might be NetApp carrying out snapshots and backups at the storage layer, the new breed of instant copy virtualisation based data protection vendors such as Catalogic or Actifio, or technologies such as clustering and geographic replication. There are lots of technical ways to address the problem. What I wanted to leave you with, though, was a couple of practical tips on how you can fix your backup.

Technology aside (just go with the fact that it exists to solve the problem), the starting point I always have when discussing this with a business exec is to carry out a few simple steps:

1. Identify your key systems and data

Know what’s important – understand what the systems and data repositories are, why they are important and what their loss means to the business.

2. Prioritise them

Understand the most important systems and put them in order of priority. Your budget is probably not endless, so define the systems that need protecting first. Often there’s a bonus: the solution that protects them can probably protect all the other apps too.

3. Understand the impact

See if you can put a cost against the impact of losing the key systems you have defined – this will help in defining a budget for an appropriate solution.

4. Understand your recovery time objective (RTO)

If you lose a system, then based on its importance and the cost of the system’s lack of availability, define how long as a business you can “stand” that application to be down.

Also define what you mean by recovery. Take email for example: recovery can mean the ability to just send and receive email again, and not necessarily to have historic emails returned. That can possibly come later.

5. Understand your recovery point objective (RPO)

When the system is back, define how much data you are prepared to lose – think about our example, if you are happy to lose a day, then nightly backups are fine – however if you can only afford to lose 15 minutes – you need to think again!

I think once you have defined these 5 key areas, you can then start looking for an appropriate technical solution to your business problem. Firstly, look at whether the system you have meets your needs, and if not, find one that does. And when you’re looking, don’t get lost in the technology. Just ask yourself the question: does the solution I’m considering meet my business data protection needs?
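If it helps to see that last question written down, here is a minimal sketch in Python of checking a solution against the objectives you defined in steps 2, 4 and 5. Everything in it is hypothetical: the `System` class, the `unmet` function and the two example systems are invented for illustration. The figures reuse the examples from this post, a critical system that needs 2 hours and 15 minutes, measured against a nightly backup that delivers roughly 24 hours for both.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class System:
    name: str
    priority: int    # step 2: 1 is the most important
    rto: timedelta   # step 4: how long the business can stand it being down
    rpo: timedelta   # step 5: how much data the business is prepared to lose


def unmet(systems, achievable_rto, achievable_rpo):
    """Names of systems, most important first, that the solution fails."""
    return [
        s.name
        for s in sorted(systems, key=lambda s: s.priority)
        if achievable_rto > s.rto or achievable_rpo > s.rpo
    ]


# Hypothetical systems and objectives, for illustration only.
systems = [
    System("file archive", 2, rto=timedelta(hours=48), rpo=timedelta(hours=24)),
    System("order processing", 1, rto=timedelta(hours=2), rpo=timedelta(minutes=15)),
]

# What the nightly backup from the earlier example can deliver.
gaps = unmet(
    systems,
    achievable_rto=timedelta(hours=24),
    achievable_rpo=timedelta(hours=24),
)
print(gaps)
```

Any system that shows up in `gaps` is one where the technology is dictating the answer instead of the business. Step 3, the cost of the impact, is left out of the sketch, but it is what tells you how much closing each gap is worth.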

Hopefully this helps and you can ensure that your backup and recovery procedures work perfectly and your company data is secure and you’re not exposed because “your backups are broken”.
