Continuous data protection (CDP) backs up the data in a system whenever a change is made. So if your system is hit with a virus or data loss occurs, you can go back to the most recent clean copy of the data and restore it. There are two types of continuous data protection: real-CDP and near-CDP. What are the differences between them? Can we consider continuous data protection a replacement for traditional backup systems? And who should consider using CDP in their enterprise? W. Curtis Preston, independent backup expert and executive editor for TechTarget, answers these questions and more in this Q&A.
Can you outline the differences between real-CDP and near-CDP?
The term continuous data protection really only applies to what some people call real-CDP. Real-CDP is basically data replication, where the changes that occur on the system being protected are immediately replicated to another system. Where replication and CDP differ is in what happens on the destination side. With both, there is a copy that’s continuously being updated on the destination side. But real-CDP also stores a log of these changes, allowing the system to roll those changes back to a previous point in time. Essentially, with real-CDP you’re able to undo anything that happens to a system within the period of time for which you’re storing changes.
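As a rough illustration of the change log described above, here is a minimal Python sketch of a destination copy that records the previous value of every write so it can be rolled back to an earlier point in time. The class name `CdpReplica`, its methods, and the key/value data model are all hypothetical, and the sketch assumes changes arrive in timestamp order; real CDP products work at the block or file level and are far more involved.

```python
_MISSING = object()  # marks "key did not exist before this change"


class CdpReplica:
    """Hypothetical destination copy that journals every replicated change."""

    def __init__(self):
        self.data = {}     # continuously updated copy of the protected data
        self.journal = []  # (timestamp, key, previous value) for every change

    def apply(self, timestamp, key, value):
        """Replicate one change and log what it overwrote."""
        previous = self.data.get(key, _MISSING)
        self.journal.append((timestamp, key, previous))
        self.data[key] = value

    def state_at(self, timestamp):
        """Rebuild the data as of `timestamp` by undoing later changes."""
        state = dict(self.data)
        for ts, key, previous in reversed(self.journal):
            if ts <= timestamp:
                break  # everything older was already in place at that time
            if previous is _MISSING:
                state.pop(key, None)
            else:
                state[key] = previous
        return state


replica = CdpReplica()
replica.apply(1, "orders", "v1")
replica.apply(2, "orders", "v2")
replica.apply(3, "customers", "c1")
replica.apply(4, "orders", "corrupted")  # e.g. a bad write or a virus
clean = replica.state_at(3)              # the state just before the bad write
```

Because every change is journaled, any moment inside the retained window can be recovered, not just the moments at which a snapshot happened to be taken.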
People using real-CDP typically store every change in their system for a few days. After that, they keep only selected points in time, which are treated like snapshots, and the changes that happened between those points are discarded. For example, say a person who’s using real-CDP stores every change that happens to their data system for three days, and then thins that out to hourly points in time. Then maybe after a week or a month they go to storing daily points, and so on.
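The thinning policy in that example can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name `thin_recovery_points` and its parameters are hypothetical, timestamps are plain seconds, and the cutoffs (every change for three days, hourly points up to a week, daily points after that) are one possible reading of the example, not a vendor default.

```python
HOUR = 3600
DAY = 24 * HOUR


def thin_recovery_points(timestamps, now, keep_all_for=3 * DAY, hourly_for=7 * DAY):
    """Return the recovery points (in seconds) that survive the retention policy.

    Points newer than `keep_all_for` are all kept. Older points are reduced to
    the newest one per hour, and beyond `hourly_for` to the newest one per day.
    """
    kept = []
    seen_buckets = set()
    for ts in sorted(timestamps, reverse=True):  # newest first
        age = now - ts
        if age <= keep_all_for:
            kept.append(ts)  # still inside the "every change" window
            continue
        granularity = HOUR if age <= hourly_for else DAY
        bucket = (granularity, ts // granularity)
        if bucket not in seen_buckets:  # first hit is the newest in its bucket
            seen_buckets.add(bucket)
            kept.append(ts)
    return sorted(kept)
```

Everything that falls between two surviving points is discarded, which is why older data can only be restored to the nearest hourly or daily point.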
Near-CDP only captures the data as of specific points in time. Near-CDP is not an official term; rather, it’s a combination of snapshots and data replication. Snapshots by themselves are not a good data protection mechanism because they rely on the system being protected. Replication by itself is not a good data protection mechanism either: if you do something stupid like delete a table, or you get a virus, replication simply copies the deletion or the virus over to your replication system. However, when you combine the two technologies, you get a good data protection mechanism, and many people call this near-CDP. Essentially, with near-CDP technology I would take a snapshot, say every hour, and then replicate each snapshot over to another destination.
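The snapshot-plus-replication idea can be sketched in a few lines of Python. The class `NearCdp` and its method names are hypothetical, and an in-memory dictionary stands in for the protected system; the point is only to show that a mistake on the primary does not destroy earlier snapshots once they live on another destination.

```python
import copy


class NearCdp:
    """Hypothetical near-CDP setup: periodic snapshots copied to a destination."""

    def __init__(self):
        self.primary = {}            # live data on the protected system
        self.replica_snapshots = []  # (timestamp, frozen copy) on the destination

    def take_and_replicate_snapshot(self, timestamp):
        """Freeze the current primary state and store it on the destination."""
        self.replica_snapshots.append((timestamp, copy.deepcopy(self.primary)))

    def restore(self, timestamp):
        """Return the newest replicated snapshot taken at or before `timestamp`."""
        candidates = [s for s in self.replica_snapshots if s[0] <= timestamp]
        if not candidates:
            raise LookupError("no snapshot at or before that time")
        return copy.deepcopy(max(candidates, key=lambda s: s[0])[1])


system = NearCdp()
system.primary["table"] = "rows"
system.take_and_replicate_snapshot(1)  # hour 1: snapshot holds the table
del system.primary["table"]            # someone deletes the table
system.take_and_replicate_snapshot(2)  # hour 2: the deletion is captured too
recovered = system.restore(1)          # the hour-1 copy still has the table
```

Unlike a per-change log, this approach can only return to the moments at which snapshots were taken; anything written and lost between two snapshots is not recoverable.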