If you search online, you will find plenty of documents describing deduplication and how it works, but most of them are fairly technical. I thought it might be helpful to describe deduplication in a way that makes sense to anyone who wants to understand what it does.
NetApp’s deduplication (also referred to as A-SIS) is a storage efficiency feature. Storage efficiency simply means that NetApp uses this offering to help you maximize the amount of available free space on your storage system, which in turn means that you spend less money on disk drives.
Without going into a huge amount of technical detail, I will give you an example. Let’s say you have a version-controlled document that is 10 pages long, and there are 10 versions of that document on your storage system. If each page were 1 MB in size, each version would be a total of 10 MB in size. Multiply 10 MB by 10 versions and that’s 100 MB of total space used to store multiple versions of the same document.
If only one page differs from one version to the next, and you saved only the changes for each version rather than the entire document, then the first document would be 10 MB and each of the nine revisions would be 1 MB, making your total storage needs 19 MB (10 MB + 9 × 1 MB) instead of 100 MB. The process of reducing the total storage space required for these documents from 100 MB to 19 MB is deduplication.
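To make the arithmetic easy to check, here is a small Python sketch of the same example. The variable names are made up for illustration, and the numbers are simply the ones from the example (a 10-page document, 1 MB per page, 10 versions, one changed page per revision), not measurements from a real system.

```python
# Numbers taken from the example in the text; names are illustrative only.
PAGE_MB = 1
PAGES_PER_DOCUMENT = 10
VERSIONS = 10
CHANGED_PAGES_PER_REVISION = 1

# Storing every version in full: 10 versions x 10 pages x 1 MB.
full_copies_mb = PAGE_MB * PAGES_PER_DOCUMENT * VERSIONS

# Storing the first version in full, then only the changed page of each revision.
first_version_mb = PAGE_MB * PAGES_PER_DOCUMENT
revisions_mb = (VERSIONS - 1) * CHANGED_PAGES_PER_REVISION * PAGE_MB
deduplicated_mb = first_version_mb + revisions_mb

savings_percent = 100 * (full_copies_mb - deduplicated_mb) / full_copies_mb

print(f"Full copies:  {full_copies_mb} MB")
print(f"Deduplicated: {deduplicated_mb} MB")
print(f"Space saved:  {savings_percent:.0f}%")
```

Try changing the number of versions or the number of changed pages per revision to see how the savings shift. The more the versions have in common, the more space you get back.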
Deduplication looks at each version of the document, saves only the unique content from each revision, and uses metadata to point to the original content that these documents have in common. So when you retrieve a particular version of the file, the file system returns the shared data from the original file along with the unique content from the version of the file you requested.
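If you like to see ideas in code, the following is a toy Python sketch of that "store it once, point to it many times" idea. It is not NetApp's implementation: the `ToyDedupStore` class, its method names, and the choice of SHA-256 fingerprints to spot identical pages are all assumptions made purely for illustration.

```python
import hashlib


class ToyDedupStore:
    """Hypothetical in-memory store that keeps each unique page only once."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> page content, stored a single time
        self.files = {}   # file name -> ordered list of fingerprints (the metadata)

    def write(self, name, pages):
        pointers = []
        for page in pages:
            fingerprint = hashlib.sha256(page).hexdigest()
            # Keep the page only if this content has not been seen before.
            self.chunks.setdefault(fingerprint, page)
            pointers.append(fingerprint)
        self.files[name] = pointers

    def read(self, name):
        # Rebuild the file by following its pointers to shared and unique pages.
        return [self.chunks[fp] for fp in self.files[name]]

    def stored_chunks(self):
        return len(self.chunks)


store = ToyDedupStore()

# Version 1: ten pages. Each later version changes exactly one page.
base = [f"page {i}".encode() for i in range(10)]
store.write("v1", base)
for version in range(2, 11):
    revised = list(base)
    revised[version - 1] = f"page {version - 1} revised in v{version}".encode()
    store.write(f"v{version}", revised)

logical_pages = sum(len(pointers) for pointers in store.files.values())
print(f"Pages written:  {logical_pages}")
print(f"Pages stored:   {store.stored_chunks()}")
```

Ten versions of ten pages means 100 pages were written, but only the 19 distinct pages are actually kept. Reading any version back still returns all ten of its pages, because the metadata points to the shared copies.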
In reality, all of this is done at the block level, not the file level, and there is a lot of additional technical detail as to exactly what happens, but in layperson’s terms that is how deduplication can help you maximize your available storage space. Keep an eye out for our future blog posts as we explain each of NetApp’s advertised storage efficiencies.
Check out the deduplication calculator at http://www.dedupecalc.com/ to see the potential cost and space savings you can achieve by using deduplication in your environment.