What is Data De-Duping (De-Duplication) and what does it mean to your business?

Data De-Duplication

As organizations accumulate ever larger amounts of data, the challenge of how to manage it has become a hot issue. The obvious problem of where to store the data is compounded by the logistics of backing it up, the performance problems that come with large data stores, and the limitations inherent in some software, such as Exchange database stores.

Enter data stubbing and de-duplication. Data de-duplication (you might also hear it referred to as “data de-duping” or simply “de-duping”) can achieve an end result similar to data compression, but it uses a different method to get there. This has some compelling advantages for small and mid-sized organizations that want to avoid the logistical nightmare of managing an ever-increasing collection of data. The process of de-duping looks for redundancy (multiple instances of the same data) in very large sets of data, such as SQL databases, e-mail databases (e.g. your Exchange Server), and file servers, and then references the first uniquely stored version of that data rather than storing another copy of it. The reference is a pointer back to the stored copy, and leaving that pointer in place of the duplicate data is called “stubbing”.
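To make the idea of a stub concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular product works: the `DedupStore` class and its method names are hypothetical, it compares whole items rather than the smaller blocks many real systems use, and the choice of SHA-256 to recognize identical data is an assumption.

```python
import hashlib


class DedupStore:
    """Toy store (hypothetical): identical data is kept only once."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> the single stored copy of the data
        self.stubs = {}   # item name -> fingerprint (the "stub" pointing at that copy)

    def put(self, name, data):
        """Record data under name; return True if a new copy had to be written."""
        fingerprint = hashlib.sha256(data).hexdigest()
        is_new = fingerprint not in self.blocks
        if is_new:
            self.blocks[fingerprint] = data
        # Whether or not the data was new, the name only holds a pointer to it.
        self.stubs[name] = fingerprint
        return is_new

    def get(self, name):
        """Follow the stub back to the one stored copy."""
        return self.blocks[self.stubs[name]]


store = DedupStore()
first = store.put("alice/report.pdf", b"quarterly report contents")
second = store.put("bob/report.pdf", b"quarterly report contents")
print(first, second, len(store.blocks))  # True False 1
```

Both names can still be read back in full, but the second `put` stored nothing new. It only added a stub.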

Here’s a real-world example of data de-duplication: let’s say that a certain document, perhaps a PDF, was sent or received as an email attachment 50 times by several different people in your organization. Your email server might then have that same one megabyte PDF stored 50 different times. If the email server is simply backed up or archived, all 50 instances of that PDF attachment are saved, requiring 50 megabytes of storage space. With data de-duplication, only ONE instance of the PDF attachment is actually stored. Every later reference to that PDF attachment points back to the one saved copy, so only one megabyte of storage space is required. In this example we are only talking about 49 megabytes of storage space savings, but imagine this happening on a mail server with 100 users who send thousands of emails over the course of several years.
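The arithmetic in this example can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the `storage_used` function is hypothetical, the attachment is a one megabyte stand-in rather than a real PDF, and the small amount of space the pointers themselves would take is ignored.

```python
import hashlib

MB = 1024 * 1024


def storage_used(attachments):
    """Return (bytes stored without de-duping, bytes stored with de-duping)."""
    raw_total = sum(len(a) for a in attachments)
    unique_sizes = {}
    for a in attachments:
        # Identical attachments share a fingerprint, so they are counted once.
        unique_sizes[hashlib.sha256(a).hexdigest()] = len(a)
    return raw_total, sum(unique_sizes.values())


pdf = b"x" * MB  # stand-in for the one megabyte PDF attachment
raw, deduped = storage_used([pdf] * 50)
print(raw // MB, deduped // MB, (raw - deduped) // MB)  # 50 1 49
```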

So what are the benefits of de-duping? The first (and obvious) benefit is the physical amount of storage space that can be saved on your server (as the example above demonstrates). Aside from that, there are many other key benefits of de-duping:

  • Speeds up data retrieval and improves your efficiency when accessing crucial business information.
  • Saves money by limiting hard disk expenditures (the number of hard drives you need to buy for your data).
  • Keeps your hard disks in operation longer by improving their efficiency.
  • Saves time and money on backups by reducing the amount of data that needs to be backed up.
  • Has positive “Green” implications by reducing the power required to store unnecessary duplicate data.

Interested in learning more about data de-duping and what it means for your business? The Launch Pad offers both hosted and on-premises strategies for email and document management. Contact us for a free technology assessment, or call 813-920-0788 x210, so we can help you identify the right solution for your organization.

Ryan Montague
Sr. Marketing Manager
