Data Deduplication FAQ

The SNIA Networking Storage Forum (NSF) recently took on the topics surrounding data reduction with a 3-part webcast series that covered Data Reduction Basics, Data Compression and Data Deduplication. If you missed any of them, they are all available on-demand.

In “Not Again! Data Deduplication for Storage Systems,” our SNIA experts discussed how to reduce the number of copies of data that get stored, mirrored, or backed up. Attendees asked some interesting questions during the live event, and here are answers to them all.

Q. Why do we use the term rehydration for deduplication? I believe the use of the term rehydration when associated with deduplication is misleading. Rehydration is the activity of bringing something back to its original content/size, as in compression. With deduplication, the action is more aligned with a scatter/gather I/O profile, and this does not require rehydration.

Read More

Data Compression Q&A

Everyone is looking to squeeze more efficiency from storage. That’s why the SNIA Networking Storage Forum hosted a live webcast last month, “Compression: Putting the Squeeze on Storage.” The audience asked many great questions on compression techniques. Here are answers from our expert presenters, John Kim and Brian Will:

Q. When multiple unrelated entities are likely to compress the data, how do they understand that the data is already compressed and so skip the compression?

A. Often they can tell from the file extension or header that the file has already been compressed. Otherwise each entity that wants to compress the data will try to compress it and then discard the results if it makes the file larger (because it was already compressed). 
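The approach described in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: the extension list, the magic-byte prefixes, and the function names `looks_compressed` and `maybe_compress` are hypothetical, and zlib stands in for whatever compressor a given entity actually uses.

```python
import os
import zlib

# Hypothetical, deliberately short lists for illustration only.
COMPRESSED_EXTENSIONS = {".gz", ".zip", ".jpg", ".mp4"}
MAGIC_PREFIXES = (b"\x1f\x8b", b"PK\x03\x04")  # gzip and zip headers


def looks_compressed(name, data):
    """Cheap check: does the extension or header suggest compressed data?"""
    ext = os.path.splitext(name)[1].lower()
    return ext in COMPRESSED_EXTENSIONS or data.startswith(MAGIC_PREFIXES)


def maybe_compress(name, data):
    """Return (payload, was_compressed).

    Skip data that already looks compressed; otherwise try compressing
    and discard the result if it is not smaller than the input.
    """
    if looks_compressed(name, data):
        return data, False
    candidate = zlib.compress(data)
    if len(candidate) >= len(data):
        return data, False  # compression did not help, keep the original
    return candidate, True
```

A caller would store the returned flag alongside the payload so that a later reader knows whether to decompress.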

Read More

An FAQ on Data Reduction Fundamentals

There’s a fair amount of confusion when it comes to data reduction terminology and techniques. That’s why the SNIA Networking Storage Forum (NSF) hosted a live webcast, “Everything You Wanted to Know About Storage But Were Too Proud to Ask: Data Reduction.” It was a 101-level lesson on the fundamentals of data reduction, which can be performed in different places and at different stages of the data lifecycle. The goal was to clear up confusion around different data reduction and data compression techniques and set the stage for deeper-dive webcasts on this topic (see the end of this blog for info on those).

As promised during the webcast, here are answers to the questions we didn’t have time to address during the live event.

Q. Does block level compression have any direct advantage over file level compression?

Read More

Compression Puts the Squeeze on Storage

Everyone knows data volumes are exploding faster than IT budgets. And customers are increasingly moving to flash storage, which is faster and easier to use than hard drives, but still more expensive. To cope with this conundrum and squeeze more efficiency from storage, storage vendors and customers can turn to data reduction techniques such as compression, deduplication, thin provisioning and snapshots.

On September 2, 2020, the SNIA Networking Storage Forum will specifically focus on data compression in our live webcast, “Compression: Putting the Squeeze on Storage.” Compression can be done at different times, at different stages in the storage process, and using different techniques. We’ll discuss:

Read More

Data Reduction: Don’t Be Too Proud to Ask

It’s back! Our SNIA Networking Storage Forum (NSF) webcast series “Everything You Wanted to Know About Storage but Were Too Proud to Ask” will return on August 18, 2020. After a little hiatus, we are going to tackle the topic of data reduction.

Everyone knows data volumes are growing rapidly (25-35% per year according to many analysts), far faster than IT budgets, which are constrained to flat or minimal annual growth rates. One of the drivers of such rapid data growth is storing multiple copies of the same data. Developers copy data for testing and analysis. Users email and store multiple copies of the same files. Administrators typically back up the same data over and over, often with minimal to no changes.

To avoid a budget crisis and paying more than once to store the same data, storage vendors and customers can use data reduction techniques such as deduplication, compression, thin provisioning, clones, and snapshots. 
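As a rough illustration of how deduplication avoids paying more than once for the same data, here is a minimal Python sketch. It assumes fixed-size chunks and SHA-256 fingerprints, which is only one of several possible designs; the class name `DedupStore`, its methods, and the 4 KiB chunk size are hypothetical choices made for this example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, for illustration only


class DedupStore:
    """Toy store that keeps each unique chunk once, keyed by fingerprint."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if this fingerprint is new.
            self.chunks.setdefault(fp, chunk)
            recipe.append(fp)
        self.files[name] = recipe

    def read(self, name):
        # Gather the chunks listed in the file's recipe, in order.
        return b"".join(self.chunks[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())
```

Writing the same file under two names stores its chunks only once; the second copy costs just a list of fingerprints.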

On August 18th, our live webcast “Everything You Wanted to Know about Storage but Were Too Proud to Ask – Part Onyx” will focus on the fundamentals of data reduction, which can be performed in different places and at different stages of the data lifecycle. Like most technologies, there are related means to do this, but with enough differences to cause confusion. For that reason, we’re going to be looking at:

Read More